Prevention of traffic analysis will be of considerable importance in the
communication subsystems of the future as the migration towards use of public
networks for secure communication continues. Traffic analysis is a security
compromise in which analysis of certain traffic characteristics results in
information disclosure through inference. Traffic analysis countermeasures are
concerned with masking the traffic characteristics that can be used covertly to
communicate information in violation of the security policy.

This dissertation presents a new model to prevent traffic analysis without relying
on link or network layer encryption. The model obtains a spatially neutral traffic
matrix by rerouting traffic away from heavily loaded links and inserting dummy
packets if necessary. An algorithm to obtain a spatially neutral traffic matrix is
presented, and simulation results are compared with results obtained from an
integer linear programming implementation.

The notion of temporal neutrality is formalized and transmission schedules are
proposed to ensure that observable traffic characteristics are temporally neutral.
The static scheduling policy eliminates covert channels but is unresponsive to
fluctuations in system load; the adaptive scheduling policy seeks to improve
throughput and provides for quality of service guarantees at the expense of
allowing certain covert channels.

An  analysis  of  covert  channels  shows  a  tradeoff  between  the  covert  channel 
capacity  and  the  responsiveness  of  the  system.  Formal  and  informal  techniques  to 
estimate  covert  channel  capacity  are  proposed  and  general  bounds  on  network  covert 
channel  capacity  are  derived. 

Criteria  for  auditing  network  covert  channels  are  defined  and  several  handling 
policies  are  proposed  to  lower  the  covert  channel  capacity  to  TCSEC  acceptable 
levels.  Capacities  of  network  covert  channels  are  estimated  with  and  without  handling 
policies. 

Simulation studies of the algorithm, performed using uniform traffic and a traffic
trace from the University of Florida campus-wide backbone network, indicate that
the model can be effectively implemented in actual networks to prevent traffic
analysis and associated covert channels.

CHAPTER  1
INTRODUCTION 


With the proliferation of computers and the networks used to interconnect them,
guarantees of secure communication are as important as the traditional computer and
information security assurances. Given the digitization of information, extensive
logical connectivity and easy access to information sources, it is not sufficient to
simply guarantee the security of stored information. Information in transit (as
messages) must be protected from unauthorized release and modification, and the
connection itself must be established and maintained securely.

Prevention of traffic analysis is one of the goals of communication security and
seeks to prevent an eavesdropper from gaining any meaningful information about
network users' behavior or objectives by observing the legitimate traffic on the
network.

A  related  issue  is  the  existence  of  covert  channels  due  to  the  variation  in  traffic 
characteristics.  The  variation  in  the  spatial  and  temporal  characteristics  of  network 
traffic  can  be  used  to  encode  symbols  to  communicate  information  via  covert  channels. 
The Department of Defense (DoD) sponsored Trusted Computer System Evaluation
Criteria (TCSEC) and INFOSEC provide technical hardware, firmware, and software
security criteria and associated technical evaluation methodologies in support of the
overall automatic data processing system security policy, evaluation and accreditation
functions[10, 37]. In 1993, the National Computer Security Center (NCSC) published
A Guideline To Understanding Covert Channel Analysis[38]. That it took almost a
decade after the first release of the DoD TCSEC to formulate a coherent approach to
the problem of covert channels is an indication of the growing importance of, and
interest in, this area.

However, the covert channel guideline is surprisingly silent on network covert
channels, that is, the covert channels that manifest due to the effect of existing
high level communication protocols on network traffic characteristics.

Covert channels in communication subsystems that implement a valid
interpretation of a consistent security policy are based on the observation of
the extrinsic characteristics of the communication without necessarily having
access to the information contained within messages (due to encryption) or the
necessity to modulate internal states or variables.

We  use  the  above  definition  of  covert  channels  to  identify  covert  channels  due  to 
spatial  and  temporal  variation  in  traffic  characteristics  and  to  develop  mechanisms  to 
contain  the  capacity  of  these  covert  channels.  A  model  for  the  high  level  prevention  of 
traffic  analysis  that  presents  an  eavesdropper  with  a  completely  neutral  traffic  pattern 
is  proposed.  The  objective  of  a  neutral  traffic  matrix  is  achieved  by  rerouting  traffic 
via  other  nodes  in  the  network  and  using  dummy  packets  if  necessary. 

The notion of temporal neutrality is used to characterize temporal variation in
traffic. Transmission schedules are proposed to generate temporally neutral
transmission characteristics. Formal techniques based on Shannon's information
theory[49] are used to estimate covert channel capacity, and the results are compared
with capacity estimates derived by informal techniques based on a mode based secure
system[7]. Auditability of network covert channels is studied using traffic
characteristics derived from our measurements on UFNET. Various handling techniques
are employed to reduce the covert channel capacity to TCSEC acceptable levels.

The performance of the algorithm to achieve spatial neutrality is compared with
that of an integer linear programming implementation that derives spatially neutral
traffic matrices. Results indicate that the cost of the algorithm's solution is
within 10% of the optimal cost for small traffic matrices and within 30% of the
optimal cost for larger traffic matrices. Performance analysis of the model shows
that it can be effectively employed in actual networks under moderately loaded
conditions with acceptable overheads.

Chapter 2 reviews various definitions used in covert channel analysis and
summarizes the NCSC guideline on covert channel analysis techniques. Chapter 3
introduces our model for the prevention of traffic analysis and addresses covert
channels due to spatial variation in traffic characteristics. Chapter 4 extends this
model to address temporal variations in traffic characteristics. Two transmission
scheduling policies that eliminate or reduce covert channels due to temporal
variation are presented in this chapter. A discussion of the tradeoffs between the
adaptability of the transmission schedule and the capacity of covert channels is also
included. Chapter 5 presents a formal and informal analysis of covert channel
capacity under various transmission conditions. General bounds are derived for
network covert channel capacity. Chapter 6 discusses the auditability of network
covert channels and presents various handling policies to reduce the channel
capacity. Traffic characteristics from measurements done on the University of
Florida campus-wide backbone network (UFNET) are used to study the auditability of
network covert channels. Chapter 7 analyzes the performance of the model, starting
with a uniform traffic matrix and later extending it to use measurements from UFNET.
Chapter 8 presents our conclusions and suggestions for future work.

The major accomplishments of this thesis are:

•  Prevention of traffic analysis is identified as one of the objectives of
communication security; this is the first formal model presented to obtain
spatially neutral traffic matrices.

•  Characterization of temporal variation in transmission characteristics and the
definition of temporal neutrality led to the detection of certain covert channels
that exist due to temporal variation in transmission characteristics.

•  Development of transmission scheduling policies to contain, if not eliminate,
network covert channels due to temporal variations in transmission
characteristics.

•  Development of techniques to estimate the capacity of network covert
channels based on results from Shannon's information theory[49] and Millen[31].
We derive general results for computing maximum capacity for network covert
channels under our model.

•  Auditability of network covert channels was studied and various handling
mechanisms were proposed to reduce covert channel capacity.

•  A heuristic algorithm to obtain a spatially neutral traffic matrix was given.
The algorithm's performance was compared with an integer programming
implementation. Simulation studies were also done to evaluate the performance of
the algorithm using traffic characteristics based on measurements done on UFNET
and synthetically generated traffic (uniform distribution).

Every new network will pose unique challenges in terms of its connectivity,
the services provided and the security of the network subsystem. This thesis
presents a theoretical framework to address one aspect of network security:
prevention of traffic analysis and network covert channels.


CHAPTER  2 
BACKGROUND  AND  LITERATURE  SURVEY 


In  this  chapter,  we  will  review  the  goals  of  communication  security  with  an 
emphasis  on  traffic  analysis.  This  is  followed  by  a  survey  of  research  done  in  the 
related  areas  of  traffic  characterization  in  computer  networks,  traffic  analysis  and 
covert  channels  analysis  in  secure  computer  systems,  and  policies  to  handle  known 
covert  channels.  We  also  present  an  overview  of  our  model  to  prevent  traffic  analysis. 

2.1     Communication  Security  Goals 

The primary goal of a communication subsystem or network is to provide remote
access to users and facilitate resource sharing, which increases its vulnerability to
attacks by wiretappers and intruders, and the potential for covert channels. The
issues involved in the security of communication systems differ from those of
computer systems because network security depends on factors such as the network
architecture, the topology, connectivity, the communication protocols used, the
security of each individual node connected to the network, and the security of each
link interconnecting the nodes. Each of the above must be both individually secure
and securely composable for the communication subsystem to be deemed secure.

Potential security violations in networks include unauthorized release or
modification of information, and unauthorized denial of use of resources [57]. While
passive attacks cause information release, active attacks cause information
modification or denial of resources. An intruder can mount an active attack in any
of the following ways:

•  modifying the message stream either by modifying the protocol control
information or by modifying data contained within a message;

•  denial  of  message  service  either  by  dropping  some  or  all  the  packets  in  an 
association  or  by  delaying  them  in  one  or  both  directions; 

•  spurious  association  initiation  either  by  replaying  an  old  association  sequence 
or  by  attempting  to  establish  an  association  under  a  false  identity. 

In  a  passive  attack,  the  intruder  simply  releases  the  contents  of  a  message  or 
mounts  a  traffic  analysis  attack  to  infer  user  behavior  and  exploit  certain  covert 
channels. 

The  goals  of  communication  security  as  stated  in  Voydock[57]  are 

•  prevention  of  release  of  message  contents; 

•  prevention  of  traffic  analysis; 

•  detection  of  message-stream  modification; 

•  detection  of  denial  of  message  service; 

•  detection  of  spurious  association  initiation. 


This thesis is primarily concerned with the prevention of traffic analysis in
communication networks and associated covert channels. To this end, we identify the
following steps:

1.  Perform  a  legitimate  study  of  the  traffic  characteristics  so  that  unusual  network 
events  can  be  compared  against  the  "normal"  network  performance  and  traffic 
characteristics.  Related  objectives  are 

•  performance  considerations:  detect  any  anomalous  behavior  of  individual 
network  components; 

•  security considerations: identify any actual or attempted breach in
network security.

2.  Based  on  the  traffic  characteristics  of  the  network  under  consideration,  adopt  a 
suitable  mechanism  to  prevent  traffic  analysis  and  implement  it  as  part  of  the 
computer  and  network  security  policy; 

3.  Identify  and  estimate  the  capacity  of  network  covert  channels  that  exist  even 
after  smoothing  traffic  patterns  to  prevent  traffic  analysis; 

4.  Devise  mechanisms  to  contain  covert  channel  capacity  and  audit  the  usage  of 
these  channels; 

5.  Analyze the performance of the proposed model on an actual network
(implementation) or use traffic traces from actual networks to study the model
behavior (simulation).

In the following section, we will discuss the motivation for traffic characterization
and  present  a  summary  of  results  from  our  measurements  on  the  University  of  Florida 
campus-wide  backbone  network  (UFNET).  We  also  briefly  discuss  alternative  models 
for  traffic  characterization.  In  section  2.3,  we  introduce  our  model  for  the  prevention 
of  traffic  analysis.  In  section  2.4,  we  discuss  relevant  work  on  covert  channels  and 
discuss  techniques  to  estimate  their  capacity.  In  section  2.5,  various  covert  channel 
handling  policies  are  discussed. 

2.2     Traffic  Characterization 

Network modeling, traffic characterization and performance analysis of existing
networks are important because they help the network designer to build realistic
models of the next generation of computer networks; to develop new and better network
architectures; to design a network analysis program to search for system bottlenecks;
to efficiently allocate resources and improve system throughput, performance and
reliability; and to design and implement effective network security policies.

In a secure environment, an accurate characterization of actual traffic on the
network helps design effective countermeasures to prevent traffic analysis. A
knowledge of the "normal" statistical properties of the network traffic helps the
network manager to detect any abnormal activity due to a malfunction of some
component that could affect the performance of the system, or to detect suspicious
use of resources by legitimate users, or to detect an intruder in the system.

This necessitates real-time traffic measurements and the collection of reliable
and representative traffic statistics. User activity profiles are also required for
the comparison of the economics of different communication services. In general,
traffic characterization is performed to study

•  the  behavior  of  networks  with  different  topologies; 

•  the  behavior  of  a  network  under  various  load  conditions; 

•  the interactions of various applications and their effect on the network traffic;

•  the  appropriateness  of  various  protocols  and  services  in  different  environments 
like  a  private  network,  a  high  speed  backbone,  a  public  carrier,  etc.; 

•  resource  bottlenecks  and  congestion  control  in  networks; 

•  efficient  integration  of  user  services;  and, 

•  allocation  of  resources  to  network  users  and  a  policy  for  charging  the  users  for 
resources  used. 

The above list suggests that the foremost task in the measurement and
modeling of network traffic is to understand the basic nature of network traffic in
terms of its statistical properties and to have a method to succinctly quantify the
traffic characteristics.

2.2.1     Traffic  Characterization  of  UFNET

The University of Florida campus-wide backbone network (UFNET) is a fiber optic
backbone providing a data highway for approximately 150 departmental local area
networks (LANs). UFNET consists of a core of Wellfleet routers that perform two
basic functions: packet switching and dynamic routing. As each departmental LAN
is a subnet, UFNET behaves as a backbone network carrying only interdepartmental
traffic. Analysis of traffic characteristics is based on the TCP/IP traffic observed on
UFNET and is presented in the literature [54, 3, 2, 22]. In addition to monitoring
traffic on UFNET, we performed measurements on departmental LANs to analyze
intra-LAN, inter-LAN and LAN-WAN (Wide Area Network) traffic characteristics.

UFNET carries a variety of traffic; our research concentrated on the most
common transport and application protocols. In particular, traffic characteristics of
the Transmission Control Protocol (TCP) and User Datagram Protocol (UDP)
transport protocols, and of applications like Simple Mail Transfer Protocol (SMTP),
File Transfer Protocol (FTP), Telnet, rlogin (terminal traffic), Network News
Transfer Protocol (NNTP) and X-server application protocols were studied.

The necessity to characterize application traffic arises due to the following
concerns. Measured interarrival times alone are not adequate to characterize a
conversation, because interarrival times are themselves a function of existing flow
control mechanisms. A conversation is defined to be a stream of packets traveling
between the end points of an association; an association is in turn defined as a
<protocol, source address, source port, destination address, destination port> tuple
for the purpose of driving flow and congestion control algorithm simulations [8].
Interarrival times effectively characterize interactive traffic, which is unlikely to
be constrained by flow control. On the other hand, bulk traffic must be characterized
by the amount of data transferred; the observed duration of the bulk transfers mostly
reflects the effects of network link speed and the flow control algorithm.
Furthermore, though interactive conversations are bidirectional, they send much more
data in one direction than in the other; an accurate model must take this into
account. Section 2.2.3 discusses one such model.
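
The definitions above can be illustrated with a short sketch that groups observed
packets into conversations and reports the bytes sent in each direction. This is only
an illustration: the `Packet` record, the function names, and the choice to use a
direction-independent key (so that both directions of a conversation fall under one
association) are assumptions made for the example, not part of the measurement tools
used on UFNET.

```python
from collections import defaultdict
from typing import NamedTuple


class Packet(NamedTuple):
    """Hypothetical record of one observed packet."""
    timestamp: float  # seconds
    protocol: str
    src_addr: str
    src_port: int
    dst_addr: str
    dst_port: int
    size: int         # bytes


def association_key(p):
    """Direction-independent association key (an assumption of this sketch).

    The two end points are sorted so that packets flowing in either
    direction map to the same key.
    """
    end_a = (p.src_addr, p.src_port)
    end_b = (p.dst_addr, p.dst_port)
    return (p.protocol,) + tuple(sorted([end_a, end_b]))


def bytes_per_direction(packets):
    """Map each association to the bytes sent by each of its end points."""
    totals = defaultdict(lambda: defaultdict(int))
    for p in packets:
        totals[association_key(p)][(p.src_addr, p.src_port)] += p.size
    return {key: dict(per_sender) for key, per_sender in totals.items()}
```

For an interactive session, the per-direction totals would be expected to differ
widely, which is the asymmetry that an accurate model must capture.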

Though the volume of traffic on the backbone has been growing steadily,
switching capacity easily exceeds observed loads; thus even though utilization is
high, the measurements were done on an uncongested network. Possible sources of
packet loss are the limitations of the monitoring tool and the lack of buffer space
at the switching nodes. Etherfind and NFSwatch were the tools used to monitor the
network. The 20 ms (10 ± 10 ms) resolution of the Sun's clock prevents a more
detailed view of interarrival time in this range. Better clock resolution would lead
to a more accurate picture of the traffic characteristics[54, 3].

2.2.2     Quantitative  Analysis  of  Traffic 

Traffic on UFNET was characterized by the distributions of the number of bytes
transferred, connection duration, number of packets transferred, packet sizes, and
packet interarrival times based on the volume of traffic generated by several
applications at various levels in the inter-LAN and intra-LAN network hierarchy. It
was observed that inter-LAN, intra-LAN and LAN-WAN traffic demonstrate distinctive
patterns[2]. These patterns are important when setting system parameters like buffer
sizes, selecting routing and congestion control strategies, and building traffic
generators for simulation studies of network and protocol performance[54].

Similar studies [54, 2, 3, 17, 8, 22] have performed hierarchical quantitative
analysis of network traffic in terms of the volume of traffic on a backbone network
and have studied inter-LAN, intra-LAN and LAN-WAN traffic characteristics.

2.2.3     Packet  Train  Model  to  Characterize  Traffic

Traditionally,  models  of  packet  arrival  in  communication  networks  have  assumed 
either  Poisson  or  compound  Poisson  arrival  patterns.  A  study  of  a  token  ring  LAN 
at  MIT  found  that  packet  arrival  followed  neither  of  these  models.  Instead,  traffic 
followed  a  more  general  model  called  the  "Packet  train",  which  describes  the  network 
traffic  as  a  collection  of  packet  streams  traveling  between  pairs  of  nodes[24,  44]. 

A packet train consists of a number of packets flowing in both directions between
a particular node pair. Each packet in a train is called a car. Train length, or the
number of cars in a train, is delimited by the maximum allowed intercar gap (MAIG).
In a given packet stream, a packet whose interarrival gap exceeds the MAIG is
declared the first packet in the next train between the same node pair. The MAIG is
chosen such that 90 percent of the packet interarrival gaps of a packet stream are
less than the MAIG. Mean inter-train arrival time was found to be much larger than
mean interarrival time for packets within a single train.
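
As a minimal sketch of this segmentation rule, the following code splits the packet
arrival times observed between one node pair into trains and picks an MAIG from the
observed gaps. The function names and the simple rank-based percentile (which selects
the smallest observed gap that at least the requested fraction of gaps do not exceed)
are assumptions made for illustration; the cited studies may have chosen the MAIG
differently.

```python
import math


def percentile_maig(gaps, fraction=0.9):
    """Pick an MAIG from observed interarrival gaps (rank-based percentile).

    Returns the smallest observed gap g such that at least `fraction`
    of all gaps are less than or equal to g.
    """
    ordered = sorted(gaps)
    index = max(0, math.ceil(fraction * len(ordered)) - 1)
    return ordered[index]


def split_into_trains(timestamps, maig):
    """Group arrival times for one node pair into packet trains.

    A packet whose gap from the previous packet exceeds `maig`
    becomes the first car of a new train.
    """
    trains = []
    for t in sorted(timestamps):
        if trains and t - trains[-1][-1] <= maig:
            trains[-1].append(t)   # same train: gap within the MAIG
        else:
            trains.append([t])     # gap too large (or first packet): new train
    return trains
```

With arrival times `[0.0, 0.1, 0.2, 5.0, 5.1, 12.0]` and an MAIG of 1.0 second, the
packets fall into three trains of three, two and one cars.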

Specific protocols exhibit two basic differences: intercar gap and train length.
Locality is another quantifiable aspect of the packet train model. Locality describes
the extent of train overlap and is defined as the probability that the next received
packet is from the same train as the current packet. Assuming a uniform probability
of receiving a packet from each node in an m-node network, the probability that
the next packet is from the same train as the current packet is 2/m (current or
opposite direction). The observed probability was 60 percent for the MIT LAN [24].
Owing to the great number of hosts using the backbone and the high utilization of
UFNET, locality there is much lower and varies inversely with network utilization.
Furthermore, traffic in transit via a particular LAN exhibits less locality than local
traffic because of the probable greater number of communicating hosts and the high
degree of merged traffic [54, 22].
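
The following sketch shows one way locality could be estimated from a packet trace and
compared with the uniform baseline of 2/m. It approximates "same train" by "same
unordered node pair", ignoring the MAIG; that simplification, and the function names,
are assumptions made for this example only.

```python
def observed_locality(packets):
    """Fraction of packets that follow a packet of the same node pair.

    `packets` is a sequence of (source, destination) pairs in arrival
    order. Direction is ignored, so (A, B) and (B, A) count as the same
    pair. Train membership is approximated by node pair alone.
    """
    if len(packets) < 2:
        return 0.0
    pairs = [frozenset(p) for p in packets]
    same = sum(1 for prev, cur in zip(pairs, pairs[1:]) if prev == cur)
    return same / (len(pairs) - 1)


def uniform_locality(m):
    """Baseline locality under the uniform assumption for an m-node network."""
    return 2 / m
```

A measured value well above `uniform_locality(m)`, such as the 60 percent reported for
the MIT LAN, indicates that trains overlap little; heavily merged backbone traffic
would be expected to fall closer to the baseline.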

The  basic  premise  of  the  packet  train  model  is  that  successive  packets  generated 
by  an  application  or  a  session  are  closely  related  vis-a-vis  their  statistical  properties. 
By  characterizing  traffic  by  applications  or  user  sessions,  we  hope  to  have  a  better 
understanding  of  the  traffic.  While  this  model  is  more  useful  than  the  hierarchical 
quantitative  analysis,  it  is  still  not  clear  how  one  can  extrapolate,  in  general,  the 
characteristics of a few selected applications measured over a small number of sessions
across  certain  selected  nodes  in  the  network  to  characterize  the  highly  dynamic  and 
transient  traffic  on  a  backbone.  Also  there  are  no  tools  that  will  give  a  correct  measure 
of  application  specific  traffic.  Application  related  information  is  generally  extracted 
out  of  lower  layer  protocol  specifications,  which  raises  the  important  question  of  the 
accuracy  and  "completeness"  of  monitored  traffic.  Such  a  characterization  of  traffic, 
while  useful,  does  not  characterize  the  gross  traffic  on  the  network  comprehensively. 
However, packet train analysis is a very useful technique. The packet train model
was extended to the session train model[22].

2.3     Prevention  of  Traffic  Analysis

One  of  the  goals  of  communication  security  is  the  prevention  of  traffic  analysis, 
by  which  an  intruder  may  deduce  important  information  from  the  mere  presence 
of  message  traffic  in  a  network  [25,  57,  60,  48].  This  information  may  yield  clues 
to  the  activity  or  intentions  of  unsuspecting  network  users,  or  may  provide  a  covert 
channel  for  communication  between  an  intruder  and  an  accomplice  within  the  system. 
Traffic  analysis  countermeasures  must  mask  the  amount  and  nature  of  traffic  between 
origin-destination  pairs  within  the  network.  The  precision  with  which  an  intruder 
can  analyze  these  traffic  patterns  determines  the  amount  of  information  that  he  can 
infer  about  the  network  user[63]. 

Most previous work has used the International Organization for Standardization
(ISO) Open Systems Interconnection (OSI) network architecture for describing threats
and countermeasures [23, 57, 58, 59, 48, 65, 61]; we will do the same. The OSI model
consists of seven layers, from the lowest, or physical layer, through the data link,
network, transport, session and presentation layers to the highest, or application
layer [51]. The bottom three layers are present in all the nodes in the communication
subnet, providing the means for messages to be conveyed from host to host. At the
data link layer, only communication between nodes in immediate contact is
considered, while at the network layer, routing within the subnet is performed,
necessitating at least destination addresses. The transport layer is the lowest layer
to deal with end-to-end communication between hosts. Higher layers are concerned with
particular entities residing at a host and rely on the transport layer to provide
end-to-end communication services. The standard approach to preventing unauthorized
release of information in a network is encryption [62]. A major issue is the OSI
level(s) at which encryption is performed; we will not address this issue in this
thesis and refer the interested reader to Padlipsky et al.[43].

The two basic approaches to communication security are link-oriented security
measures, which provide security by protecting message traffic independently on each
communication link, and end-to-end security measures, which provide protection for
each message from its source to its destination node[57].

Link encryption, performed at the data link layer, can hide source and
destination information, as well as message contents. It can prevent direct traffic
analysis as long as the nodes themselves are secure but may allow information about
loads on each link to be learned and thus allow indirect traffic analysis. Other
problems of link encryption schemes include [58]:

1)  continuous  key  stream  generation  at  each  node; 

2)  ensuring  physical  security  of  all  intermediate  nodes; 

3)  difficulties  in  cost  apportionment  to  the  users. 

Each pair of communicating nodes in the subnet must share an encryption means
and the keys to implement it. These keys must be distributed securely. If a node
is compromised, then all the data passing through the node will be available to
the eavesdropper, allowing at the least direct traffic analysis. Unless messages have
been encrypted at a higher level, the contents of messages and high level entities in
communication will also be revealed.

Encryption at each link is very costly, both in hardware and in time,
particularly when one considers that the communication subnet is often responsible
for transmitting a packet across many links between the source and destination hosts.
If all messages must be encrypted, all users in the network share the cost of this
service, both in dollars and in time delay, whether or not they feel the need for such
protection.

It  is  generally  accepted  that  performing  encryption  at  the  network  layer  will  be 
too  costly.  Encrypting  the  actual  destination  of  the  message  necessitates  sending  all 
messages  to  all  hosts  on  the  network,  using  prohibitive  amounts  of  bandwidth  and 
wasting  host  processing  power. 

End-to-end  encryption,  performed  at  the  transport  layer  (or  higher),  is  more 
suitable  as  a  security  mechanism.  Since  the  destination  address  and  possibly  the 
source  address  are  not  hidden  (these  are  part  of  the  network  layer  header),  traffic 
analysis  is  not  prevented. 

Although  performing  the  encryption  on  the  transport  layer  allows  the  intruder 
to  look  at  the  traffic  pattern  at  the  network  address  level,  he  cannot  deduce  which 
presentation  or  session  entities  are  communicating  [48].  Still,  an  intruder  may  gain 
some  useful  information  regarding  the  traffic  pattern.  The  purpose  of  this  model  is  to 
provide  a  method  by  which  an  intruder  may  be  prevented  from  deducing  any  useful 
information  from  observations  of  the  traffic,  even  though  encryption  and  all  other 
security related operations are performed at the transport layer.

To achieve protection beyond that offered by encryption at the transport layer
at  a  cost  less  than  that  exacted  by  sending  dummy  packets  alone,  we  propose  an 
approach  in  which  the  transport  layer  entities  may  send  a  message  to  a  destination 
other  than  the  true  destination  of  the  message.  Transport  entities  must  agree  to 
forward  these  rerouted  messages  to  their  true  destination  when  they  are  received  and 
initially  decoded.  In  this  way,  we  manipulate  the  initial  traffic  matrix  so  that  each 
host  sends  every  other  host  in  the  network  the  same  volume  of  messages,  that  is,  it 
presents  the  intruder  with  a  neutral  traffic  matrix  (defined  in  section  3.1).  Regardless 
of  the  original  traffic  pattern,  the  intruder  observing  the  manipulated  traffic  pattern 
will  see  only  even  communication  levels.  Thus,  the  intruder  cannot  derive  useful 
information  regarding  the  original  traffic  patterns.  We  call  this  the  spatial  neutrality 
criterion. 

To  summarize,  messages  from  higher  level  entities  are  encrypted  by  the  transport 
layer  (if  they  have  not  been  encrypted  already)  using  the  true  destination's  key.  The 
rerouting  schedule  for  the  sending  host  is  then  consulted  to  determine  if  this  message 
should  be  sent  directly  to  the  true  destination  or  should  be  rerouted  through  an 
intermediate  host's  transport  layer.  If  the  message  is  to  be  sent  directly  to  the  true 
destination,  then  the  encrypted  transport  layer  protocol  data  unit  (PDU)  is  passed 
to  the  network  layer  with  the  true  destination  as  the  recipient.  If  the  message  is 
to  be  rerouted,  then  the  source  and  destination  are  prepended  to  the  encrypted 
message  to  form  a  new  message.  This  entire  message  is  then  encrypted  using  the 
intermediate host's key and passed to the network layer with the intermediate host
as the apparent recipient. When the transport layer receives a message, it decrypts
it  and  determines  whether  it  was  really  destined  for  it  or  for  some  other  node.  If 
the  message's  true  destination  was  some  other  node,  then  the  intermediate  host's 
transport  layer  passes  the  already  formed  and  encrypted  message  to  its  network 
layer  with  the  true  destination  as  the  apparent  recipient.  In  addition,  pad  (dummy) 
messages  may  be  sent  that  need  not  be  forwarded. 
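The send and receive steps summarized above can be sketched as follows. This is only an illustration under stated assumptions: the `kind` tag, the JSON framing, the per-host symmetric keys in `keys`, and the `toy_cipher` function are all hypothetical stand-ins that do not come from the text, and the XOR keystream is not a secure cipher.

```python
import hashlib
import json


def toy_cipher(key: bytes, data: bytes) -> bytes:
    """XOR data with a SHA-256 keystream. Illustrative only, NOT secure.

    Applying it twice with the same key returns the original data.
    """
    stream = b""
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


def pack(kind: str, src: str, dst: str, body: bytes) -> bytes:
    """Hypothetical PDU framing: header fields plus a hex-encoded body."""
    return json.dumps(
        {"kind": kind, "src": src, "dst": dst, "body": body.hex()}
    ).encode()


def unpack(pdu: bytes) -> dict:
    msg = json.loads(pdu.decode())
    msg["body"] = bytes.fromhex(msg["body"])
    return msg


def send(keys, src, dst, payload: bytes, via=None):
    """Return (apparent_destination, pdu) as handed to the network layer."""
    # Encrypt for the true destination first.
    inner = toy_cipher(keys[dst], pack("data", src, dst, payload))
    if via is None:
        return dst, inner
    # Rerouted: prepend source and destination, then encrypt the whole
    # message again under the intermediate host's key.
    outer = toy_cipher(keys[via], pack("forward", src, dst, inner))
    return via, outer


def make_pad(keys, src, dst):
    """Build a dummy message that the receiver decodes and discards."""
    return dst, toy_cipher(keys[dst], pack("pad", src, dst, b""))


def receive(keys, me, pdu: bytes):
    """Decrypt a PDU and decide what the transport layer does with it."""
    msg = unpack(toy_cipher(keys[me], pdu))
    if msg["kind"] == "forward":
        # Pass the already encrypted inner message on to the true destination.
        return ("forward", msg["dst"], msg["body"])
    if msg["kind"] == "pad":
        return ("discard", None, None)
    return ("deliver", msg["src"], msg["body"])
```

On the wire, all three kinds of message are ciphertext addressed to an apparent destination; only the transport layer holding the key learns whether a message is direct, rerouted, or padding.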

Provided  that  the  actual  flow  of  messages  arriving  from  the  transport  layer  is 
"regular"  and  not  correlated  with  the  incoming  traffic,  the  network  layer  will  not  be 
able  to  distinguish  between  direct  messages,  pad  messages,  and  rerouted  messages  in 
either  direction.  Since  the  true  traffic  information  is  only  available  to  the  transport 
layer,  untrusted  networks  may  be  used  without  yielding  to  traffic  analysis. 

This  approach  should  be  distinguished  from  the  standard  types  of  routing  and 
rerouting  usually  done  at  the  network  layer  [23].  First,  the  routing  decisions  made 
by  the  proposed  method  are  not  the  same  sort  as  those  made  in  the  network  layer: 
only  the  apparent  destination  is  specified  by  the  transport  layer,  not  the  particular 
output line, as is done by the network layer. Even in the case of source routing, in
which  the  entire  route  is  specified  by  the  source  host,  the  routing  decision  is  made 
at  the  network  layer.  This  model  does  not  make  assumptions  regarding  the  ways  in 
which  routing  decisions  are  made  at  the  network  layer.  Rerouting  at  the  network 
layer  is  usually  done  to  avoid  compromised  portions  of  the  network,  and  network 
layer packets are not usually encrypted. The limited form of rerouting proposed here
allows the transport layer to provide a measure of traffic flow confidentiality, and
does  not  preclude  rerouting  at  the  network  layer  to  avoid  other  attacks. 

The  basic  model  guarantees  spatial  neutrality  by  eliminating  the  variation  in 
the  relative  volume  of  traffic  between  each  pair  of  nodes.  However,  we  are  also 
concerned  with  the  temporal  variation  in  traffic  and  the  possible  introduction  of 
covert  channels  due  to  the  variation  in  the  transmission  characteristics  over  time. 
Note  that  although  the  total  volume  of  traffic  communicated  between  any  pair  of 
nodes  in  the  network  is  the  same  to  satisfy  the  spatial  neutrality  criterion,  the  source 
could  transmit  the  packets  in  a  burst  or  could  spread  out  its  transmissions  over  a 
period  of  time.  The  model  imposes  no  restriction  on  the  transmission  schedule  and 
therefore  a  knowledgeable  user  might  be  able  to  communicate  with  his  accomplice  by 
timing  the  transmissions,  thus  introducing  covert  channels. 
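To make the concern concrete, the sketch below shows how a sender could signal bits purely through inter-packet gaps while keeping the packet count unchanged, and why a fixed-period schedule carries nothing for the observer to decode. The gap lengths, the threshold, and the function names are hypothetical choices made for this illustration.

```python
def encode_gaps(bits, short=1.0, long=2.0):
    """Sender: map each covert bit to an inter-packet gap in seconds."""
    return [long if b else short for b in bits]


def decode_gaps(gaps, threshold=1.5):
    """Observer: recover bits by comparing each gap with a threshold."""
    return [1 if g > threshold else 0 for g in gaps]


def fixed_schedule(n_packets, period=1.0):
    """A schedule in which every gap is identical, whatever the sender wants."""
    return [period] * n_packets


secret = [1, 0, 1, 1, 0]
leaked = decode_gaps(encode_gaps(secret))   # equals `secret`
silent = decode_gaps(fixed_schedule(5))     # all zeros, independent of `secret`
```

Both schedules send five packets, so the volume between the pair of nodes is identical; only the timing differs.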

To  address  this  concern,  in  addition  to  requiring  that  the  traffic  matrix  be 
spatially  neutral,  we  require  the  transmission  schedule  be  temporally  neutral  to  elim- 
inate potential  covert  channels.  We  propose  two  transmission  scheduling  policies  that 
will  satisfy  our  primary  goal  of  prevention  of  traffic  analysis  and  the  prevention  of 
covert  channels  due  to  temporal  variation  in  packet  transmission  schedule.  The  static 
scheduling  policy  generates  spatially  and  temporally  neutral  transmission  schedules 
but  is  unresponsive  to  changes  in  system  load.  The  adaptive  scheduling  policy  can 
adapt  to  long  term  fluctuations  in  system  load  at  the  expense  of  allowing  certain 
covert channels. Scheduling policies are discussed in chapter 4.

In the next section, we continue our discussion of relevant work that addresses
covert  channels  and  discuss  methods  to  estimate  channel  capacity.  We  will  also 
describe  briefly  various  methods  of  handling  known  covert  channels  in  section  2.5. 

2.4     Covert  Channels 

The National Computer Security Center's recently published Guide to Understanding Covert Channel Analysis of Trusted Systems[38] is a comprehensive summary of techniques used in covert channel analysis and related areas. This document
discusses  storage  and  timing  channels  that  arise  due  to  the  sharing  of  "computational 
resource"  among  subjects  at  different  security  levels.  Examples  of  covert  storage  chan- 
nels include  "File-lock  channel",  "Table-space  exhaustion  channel",  the  "Unmount 
of  Busy  file  system  channel",  etc.,  and  examples  of  covert  timing  channel  include  the 
"CPU  quantum  channel",  "CPU  interquantum  channel",  etc. 

We  will  review  several  definitions  of  covert  storage  and  timing  channels  that 
have  been  proposed  in  the  literature. 

2.4.1     Covert  Channel  Definitions  and  Identification 

Some  definitions  of  covert  channels  are[38]: 

1.  A  communication  channel  that  allows  a  process  to  transfer  information  in  a 
manner  that  violates  the  system's  security  policy[10]. 

2.  A  covert  channel  that  involves  the  direct  or  indirect  writing  of  a  storage  location 
by  one  process  and  the  direct  or  indirect  reading  of  the  storage  location  by 
another process. Covert storage channels typically involve a finite resource
(e.g., sectors on a disk) that is shared by two subjects at different security
levels[10].

3. A covert channel in which one process signals information to another by modulating its own use of resources (e.g., CPU time) in such a way that this manipulation affects the real response time observed by the second process[10].

4. A channel is covert if it is neither designed nor intended to transfer information
at all [28].

5.  Covert  channels  are  defined  as  those  channels  that  are  a  result  of  resource 
allocation  policies  and  resource  management  implementation[21]. 

6. Covert channels are those that use entities not normally viewed as data objects
to transfer information from one subject to another[26].

7.  Given  a  nondiscretionary  security  policy  model  M  and  its  implementation  I(M) 
in  an  operating  system,  any  potential  communication  between  two  subjects 
I(S_h) and I(S_i) of I(M) is covert if and only if any communication between the
corresponding subjects S_h and S_i of the model M is illegal in M[52].

Each of the above definitions has been used successfully to identify and contain,
if not eliminate, covert channels in various security designs. However, each of the
above definitions deals with storage and timing channels and does not address network
covert channels.

We refer the interested reader to Huskamp[21] for a detailed discussion on covert
timing channels, to Tsai[52] for a discussion on covert storage channels, and to
NCSC[38] for a summary of techniques to identify covert channels and estimate their
capacity,  and  a  discussion  of  the  various  covert  channel  handling  mechanisms.  The 
covert  channel  guideline[38]  also  presents  TCSEC  requirements  for  covert  channel 
analysis  and  includes  additional  recommendations  corresponding  to  B2-A1  evaluation 
classes. 

In  Girling[14],  various  covert  channels  in  LANs  are  identified  and  their  behavior 
discussed  in  the  context  of  the  ISO  OSI  framework.  Covert  channels  due  to  address 
field  encoding,  length  of  data  block  and  time  between  successive  transmissions  were 
discussed  and  solutions  proposed.  Experiments  conducted  to  estimate  the  covert 
channel  capacity  concluded  that  high  capacity  covert  channels  could  exist  in  high 
bandwidth  networks.  The  author  concludes  that  physical  security  may  be  more 
effective  against  covert  channels  than  the  use  of  encryption  or  complex  mechanisms 
to  reduce  their  capacity. 

We partially disagree with Girling's conclusions[14]; state-of-the-art computer and communication systems require more protection than that offered by physical security. With rapid advances in the understanding of such complex systems and
the  easy  availability  of  sophisticated  technology,  we  believe  that  the  scope  and  nature 
of attacks mounted on a secure system have undergone a drastic change for the worse.
Effective  countermeasures  are  required  to  ensure  operational  security  in  real-time 
without  an  undue  penalty  on  system  performance.  Security  loopholes  that  were  too 
difficult or intractable to exploit until a few years ago are now easy prey to computer
scientists and hackers alike. Our model for prevention of traffic analysis is a step
toward providing end-to-end communication security.

In  spite  of  the  directions  in  Voydock[58]  and  Girling[14],  there  exists  no  accepted 
definition  of  network  covert  channels.  We  propose  the  following  definition  of  network 
covert  channels. 

Definition:  Covert  channels  in  communication  subsystems  that  implement  a 
valid  interpretation  of  a  consistent  security  policy,  are  based  on  the  observation 
of  the  extrinsic  characteristics  of  the  communication  without  necessarily  having 
access  to  the  information  contained  within  messages  (due  to  encryption)  or  the 
necessity  to  modulate  internal  states  or  variables. 

Using  the  above  definition  of  network  covert  channels,  in  chapter  4,  we  have 
identified  certain  covert  channels  due  to  temporal  variation  in  traffic  characteristics. 
Having  given  several  definitions  of  covert  channels  and  the  definition  of  network 
covert  channels,  we  discuss  various  covert  channel  capacity  estimation  techniques  in 
the  following  section. 

2.4.2     Covert  Channel  Capacity  Estimation  Methods 

Information-Theory-Based Method for Channel Capacity Estimation

The capacity C of a noiseless discrete channel with symbols of different length
is given by Shannon[49] as

\[ C = \lim_{t \to \infty} \frac{\log_2 N(t)}{t}, \]

where N(t) is the number of possible symbol sequences of duration t.

Suppose all sequences of the symbols S_1, S_2, ..., S_n are allowed and these symbols have durations t_1, t_2, ..., t_n. If N(t) represents the number of sequences of duration t we have

\[ N(t) = N(t - t_1) + N(t - t_2) + \cdots + N(t - t_n). \]

The total number is equal to the sum of the numbers of sequences starting with
S_1, S_2, ..., S_n and these are N(t - t_1), N(t - t_2), ..., N(t - t_n) respectively. According
to a well known result in finite differences, N(t) is then asymptotic for large t to A X_0^t,
where A is constant and X_0 is the largest real solution of the characteristic equation

\[ X^{-t_1} + X^{-t_2} + \cdots + X^{-t_n} = 1, \]

and therefore

\[ C = \lim_{t \to \infty} \frac{\log_2 \left( A X_0^{t} \right)}{t} = \log_2 X_0. \]
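As a numerical illustration of the result above, the sketch below finds the largest real root X_0 of the characteristic equation by bisection and takes its base-2 logarithm as the capacity. The function names and the choice of bisection are assumptions made for this example; the symbol durations are taken to be positive.

```python
import math


def largest_root(durations, tol=1e-12):
    """Largest real X with sum(X ** -t for t in durations) == 1.

    For positive durations the left-hand side is strictly decreasing
    in X for X > 0, so bisection on [1, hi] finds the unique root.
    """
    def f(x):
        return sum(x ** -t for t in durations) - 1.0

    lo, hi = 1.0, 2.0
    while f(hi) > 0:          # grow the bracket until the root is inside
        hi *= 2.0
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def capacity_bits_per_unit_time(durations):
    """C = log2(X_0) for a noiseless channel with the given symbol durations."""
    return math.log2(largest_root(durations))
```

For two symbols of durations 1 and 2 the characteristic equation reduces to X^2 = X + 1, so X_0 is the golden ratio and the capacity is about 0.694 bits per unit time; for two symbols of equal duration 1, X_0 = 2 and the capacity is 1 bit per unit time.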

A  capacity  estimation  method  based  on  Shannon's  information  theory  is  pre- 
sented in  Millen[31].  In  this  method,  the  assumptions  are  that  the  covert  channels  are 
noiseless;  that  other  than  the  sender  and  receiver,  there  are  no  unconfined  processes 
in  the  system  during  channel  operation;  and  the  sender-receiver  synchronization  takes 
a  negligible  amount  of  time  [31,  38].  With  these  assumptions,  one  can  model  most 
covert  channels  as  finite  state  machines  and  compute  the  maximum  attainable  ca- 
pacity. 

Informal  Method  for  Estimating  Covert  Channel  Capacity 

A  Markov  model  to  compute  the  capacity  of  different  covert  channels  is  presented 
in Tsai[53]. Results from this study indicated that "the noise factor is comparatively
insignificant for capacity degradation" [52]. Ignoring the noise caused by unconfined
processes, a simple formula for computing maximum capacity is given in NCSC[38],
which we reproduce below:

\[ B(0) = b \, (T_r + T_s + 2 T_{cs})^{-1} \]

In this formula, b represents the encoding factor (usually assumed to be 1) and

\[ T_r = \frac{\sum_{i=1}^{n} \bigl( T_r(i) + T_{env}(i) \bigr)}{n}, \qquad T_s = \frac{\sum_{i=1}^{n} T_s(i)}{n}, \]

where n is the number of total possible transitions. T_s(i) and T_r(i) are the times
necessary to set and read a 0 or a 1 after having transmitted a 0 or a 1. Thus,
n = 4. T_env(i) is the time to set up the environment to read a 0 or a 1. Note that
in these formulas it is assumed that all environment setup for both variable reading
and setting is done by the receiving processes. It is also assumed that the setting of
0s and 1s takes the same amount of time and that all transmissions contain an equal
distribution of 0s and 1s[38].
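The formula above can be evaluated directly once the per-transition timings are measured. The sketch below does so under two assumptions that are not spelled out in the text: the third term is read as twice a context-switch time, called `t_cs` here, and the timing values in the usage note are invented placeholders rather than measurements.

```python
def covert_bandwidth(t_set, t_read, t_env, t_cs, b=1.0):
    """Evaluate B(0) = b / (T_r + T_s + 2 * T_cs) in bits per second.

    t_set, t_read, t_env: per-transition times in seconds, one entry per
    transition (n = 4 for a binary channel: 0->0, 0->1, 1->0, 1->1).
    t_cs: assumed context-switch time in seconds.
    b: encoding factor, usually taken to be 1.
    """
    n = len(t_set)
    if n == 0 or len(t_read) != n or len(t_env) != n:
        raise ValueError("timing lists must be non-empty and of equal length")
    avg_set = sum(t_set) / n
    # Environment setup is charged to the reader, as the text assumes.
    avg_read = sum(r + e for r, e in zip(t_read, t_env)) / n
    return b / (avg_read + avg_set + 2.0 * t_cs)
```

With placeholder timings of 1 ms to set, 2 ms to read, 1 ms of environment setup, and 0.5 ms per context switch for every transition, the denominator is 5 ms and the estimate is 200 bits per second.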

Though  the  formal  method  presented  by  Millen[31]  yields  higher  capacity  than 
that  computed  by  the  informal  method  presented  by  Tsai[53],  NCSC[38]  concludes 
that  Millen's  method  is  superior  because  unlike  the  informal  method,  the  formal 
method  requires  the  analyst  to  set  up  a  realistic  scenario  for  covert  channel  use. 

Mode  Based  Security 

Mode  security  was  proposed  by  Browne[7]  as  a  "quantitative"  theory  to  contain 
the  information  flow  in  known  covert  channels.  In  theory,  the  surest  way  to  eliminate 
covert channels is to forbid any resource sharing between subjects at different security levels in a multilevel secure system. This can be achieved by pre-partitioning
resources between various security levels and building fire-walls around each partition. However, such partitioning leads to sub-optimal resource utilization and may
adversely  affect  the  availability  and  reliability  of  the  system. 

In  a  mode  secure  system,  instead  of  statically  partitioning  resources  among 
several  security  levels,  we  partition  the  resources  dynamically  "for  a  limited  period 
of  time,"  based  on  factors  such  as  resource  request,  utilization,  etc.  During  the  time 
interval  when  the  resource  partitioning  remains  fixed,  the  machine  is  said  to  be  in  a 
"mode"  [7].  Each  mode  of  the  machine  behaves  as  a  lattice  separable  system  which 
implies  that  there  is  no  interaction  between  different  security  levels  in  the  system. 
Therefore  when  the  machine  is  operating  in  a  mode,  there  can  be  no  covert  channels. 
Based  on  resource  requirements,  the  system  periodically  re-partitions  the  resources 
among  security  levels.  This  re-partitioning  of  resources  may  lead  to  covert  channels. 

As  Browne[7]  pointed  out,  the  capacity  of  the  channel  can  be  contained  by 
limiting  the  frequency  of  such  re-partitioning  and  the  number  of  modes  to  which  the 
system  can  transition  at  each  mode  change.  In  summary,  we  see  that  mode  security 
allows  us  to  achieve  better  resource  utilization  at  the  expense  of  allowing  certain 
covert  channels. 
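A back-of-the-envelope way to see this tradeoff is sketched below. It is not a formula taken from Browne[7]: it assumes mode changes are the only signalling events, that they occur at fixed instants at most `changes_per_second` times per second, and that each change selects one of at most `successor_modes` equally distinguishable modes, so that each change conveys at most log2(successor_modes) bits.

```python
import math


def mode_change_capacity_bound(changes_per_second, successor_modes):
    """Upper bound, in bits per second, under the assumptions stated above."""
    if changes_per_second < 0 or successor_modes < 1:
        raise ValueError("need a non-negative rate and at least one mode")
    return changes_per_second * math.log2(successor_modes)
```

Under these assumptions, halving the re-partitioning frequency halves the bound, while reducing the number of reachable modes lowers it only logarithmically; a system with a single reachable mode has a bound of zero.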

We  use  Millen's  information  theory  based  capacity  estimation  technique  along 
with  Browne's  mode  security  system  model  to  compute  the  maximum  capacity  of 
network  covert  channels  that  exist  due  to  temporal  variation  in  traffic  characteristics. 
In  chapter  5,  we  describe  our  model  using  adaptive  scheduling  policy  and  use  results 
from information theory to compute channel capacity.

While it may be very expensive to eliminate all covert channels, we can use some
handling  policies  to  reduce  the  covert  channel  capacity.  We  discuss  such  handling 
policies  in  the  following  section. 

2.5     Covert  Channel  Handling

We  discuss  three  related  techniques  to  handle  known  covert  channels: 

1.  Elimination  of  Covert  Channels: 

Covert  channels  can  be  eliminated  either  by  forbidding  any  resource  sharing  be- 
tween subjects  at  different  levels  in  a  multi-level  secure  system  or  by  eliminating 
features  and  mechanisms  in  the  operation  of  the  system  that  could  potentially 
cause  covert  channels. 

This  approach  eliminates  covert  channels  and  guarantees  a  covert  channel  free 
system  but,  in  general,  is  very  restrictive  and  expensive  to  enforce.  Any  restric- 
tion on  resource  sharing  will  most  likely  be  at  the  expense  of  system  performance 
and resource utilization. Also, redesigning the user interface to be less user friendly
and more cryptic is not a very appealing solution.

2.  Capacity  Limitation: 

The  capacity  of  known  covert  channels  can  be  reduced  by  the  introduction 
of  noise  or  delays  into  channel  operation.  The  objective  of  such  methods  is 
to  reduce  the  maximum  channel  capacity  to  a  level  where  the  cost  of  further 
reduction or elimination of these covert channels is prohibitive.

The problem with this approach lies in correctly quantifying the noise and in
correctly introducing delays in the appropriate TCB primitives and
other  scheduling  operations.  An  alternative  is  to  use  "fuzzy  time"  to  introduce 
noise  in  the  system[19]. 

While  the  introduction  of  delay  and  noise  tends  to  degrade  system  performance 
and  necessitates  restating  the  Quality  of  Service  (QOS)  guarantees,  the  penalty 
imposed  on  system  performance  may  be  justified  given  the  objective  of  contain- 
ing covert  channel  capacity. 

3.  Auditing  the  Use  of  Covert  Channels:

In  this  approach,  the  existence  of  covert  channels  is  known  to  and  possibly 
exploited by the users of a secure system. However, auditing the usage of such
channels acts as a deterrent to potential users of the channel. The difficulty in
this  approach  is  in  being  able  to  unambiguously  distinguish  between  innocuous 
user  activity  and  actual  covert  channel  usage.  A  related  problem  is  to  be  able 
to  detect  every  such  usage  of  the  covert  channel. 

Audit trails can be used to determine the use of covert channels and their capacity.
However, determining which specific events need to be monitored and
recorded by audit mechanisms to ensure that all covert channel usage is detected
is a nontrivial task.
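The effect of the capacity-limitation technique in item 2 can be illustrated with a standard textbook model that the text itself does not use: a binary symmetric channel in which injected noise flips each covert bit independently with probability p. Its capacity is 1 - H(p) bits per symbol, where H is the binary entropy function. Treating a covert channel this way is an assumption made purely for illustration.

```python
import math


def binary_entropy(p):
    """H(p) in bits; defined as 0 at the endpoints p = 0 and p = 1."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)


def bsc_capacity(p):
    """Capacity, in bits per symbol, of a binary symmetric channel."""
    return 1.0 - binary_entropy(p)
```

In this model a noiseless channel carries one bit per symbol, a 10% flip probability already removes almost half of that, and a 50% flip probability leaves nothing.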

Using  the  trace  data  from  our  measurements  on  UFNET  as  input  to  our  model, 
in  chapter  6,  we  determine  the  auditability  of  the  network  covert  channels  due  to 
variation in traffic characteristics.

2.6     Conclusion

Prevention  of  traffic  analysis  is  an  important  problem  with  no  acceptable  solu- 
tion proposed  as  yet.  This  thesis  addresses  the  deficiency  and  proposes  a  high  level 
model  for  the  prevention  of  traffic  analysis.  Considerable  work  has  been  done  on  esti- 
mating the  capacity  of  storage  and  timing  covert  channels  for  computation  resources 
and  we  extend  this  to  analyze  network  covert  channels.  Using  traffic  characteristics 
from  measurements  on  an  actual  network,  we  analyze  the  performance  of  the  model 
and  the  auditability  of  existing  covert  channels.  In  the  next  chapter,  we  introduce 
our  basic  model  for  the  prevention  of  traffic  analysis. 


CHAPTER  3 
MODEL  FOR  THE  PREVENTION  OF  TRAFFIC  ANALYSIS 


In  this  chapter,  we  propose  a  high  level  model  for  prevention  of  traffic  analysis 
in  communication  networks  and  suggest  an  approach  for  prevention  of  unauthorized 
release  of  information  concerning  traffic  patterns.  The  model  assumes  that  an  eaves- 
dropper may  read  the  contents  of  all  links,  including  the  source  and  destination,  and 
that  all  countermeasures  are  performed  at  the  transport  layer  of  the  ISO  OSI  network 
model[23].  Countermeasures  performed  at  the  transport  level  include  encryption,  a 
limited  form  of  message  rerouting,  delaying  messages,  and  sending  dummy  messages 
as  needed,  within  resource  capacities.  The  goal  of  the  countermeasures  is  to  prevent 
the  eavesdropper  from  gaining  any  useful  information  regarding  the  traffic  patterns  in 
a  cost  efficient  and  feasible  manner.  By  formulating  the  problem  in  terms  of  systems 
of  equalities  and  systems  of  inequalities,  linear  programming  methods  may  be  used 
to  find  solutions  to  the  traffic  analysis  security  problem. 

3.1     The  Basic  Model  for  Spatial  Neutrality 

A  model  of  a  network  security  system  should  include  the  resources  to  be  pro- 
tected, the  resources  available  for  protecting  them,  the  nature  of  the  threat,  and  cost 
measures  for  evaluating  the  means  of  implementing  protection.  For  the  purposes  of 
this discussion, we will assume that there are n nodes; the traffic patterns are fixed
temporally; encryption is performed at the transport layer; and the intruder may
read  the  contents  of  every  link.  For  cost  measures,  we  consider  delay,  processing 
costs  and  increased  traffic  in  the  network.  The  total  traffic  increase  will  be  the  sum 
of  all  additional  packets  sent  over  all  links.  Delay  will  be  measured  by  the  additional 
hops  a  message  must  make  to  reach  its  destination  (queuing  and  processing  delays 
will  be  ignored).  Processing  costs  will  be  measured  in  terms  of  the  number  of  addi- 
tional packets  the  hosts  must  process  (decode  and  discard  or  resend).  Between  each 
pair  of  hosts  in  the  system,  there  is  an  amount  of  communication  necessary  to  their 
operation  that  may  be  described  by  a  traffic  matrix.  The  goal  of  the  intruder  is  to 
determine  this  traffic  matrix,  while  the  goal  of  the  system  is  to  prevent  the  release  of 
this  information.  More  specifically,  the  goal  of  the  system  will  be  to  present  to  the 
intruder  a  neutral  traffic  matrix,  which  we  now  define. 

Definition: A neutral traffic matrix is a traffic matrix where

\[ m_{i,j} = \begin{cases} a & \text{iff } i \neq j \\ 0 & \text{iff } i = j \end{cases} \]

If  traffic  is  altered  by  rerouting  and  padding  so  that  the  intruder  observes  a  neu- 
tral traffic  matrix  regardless  of  the  original  traffic  pattern,  the  intruder  cannot  derive 
useful  information  regarding  the  original  traffic  patterns.  We  refer  to  such  a  traffic 
matrix  as  a  spatially  neutral  traffic  matrix  and  the  condition  as  spatial  neutrality. 
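The definition can be checked mechanically. The sketch below represents a traffic matrix as a list of rows, which is a representation chosen for this example, and tests that the diagonal is zero and that every off-diagonal entry has the same value.

```python
def is_neutral(matrix):
    """True if the diagonal is all zero and all off-diagonal entries are equal."""
    n = len(matrix)
    diagonal_zero = all(matrix[i][i] == 0 for i in range(n))
    off_diagonal = {
        matrix[i][j] for i in range(n) for j in range(n) if i != j
    }
    return diagonal_zero and len(off_diagonal) <= 1
```

For instance, a 3-node matrix with 7 in every off-diagonal position is neutral, while one in which a single pair exchanges 10 units and the others 4 is not.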

For  the  system  to  achieve  its  goal,  the  following  methods  are  available:

•  send  dummy  packets  to  the  hosts  with  whom  the  regular  traffic  volume  is  low; 

•  reroute  packets  via  one  or  more  intermediate  nodes; 

•  delay  packets.

Dummy packets allow a lightly loaded link to have its apparent load increased
at  the  cost  of  introducing  additional  traffic  in  the  network.  In  the  completely  con- 
nected network,  one  additional  traffic  unit  is  generated  by  each  dummy  packet.  The 
destination  host  must  accept  the  packet,  decode  it,  and  discard  it,  if  it  is  found  to  be 
a  dummy,  so  one  additional  processing  unit  is  charged  per  dummy  packet. 

Rerouting  allows  the  load  on  some  links  to  be  decreased,  smoothing  the  differ- 
ences in  traffic  volume  across  all  links.  This  protocol  requires  a  host  to  accept  any 
packet  that  lists  it  as  its  apparent  destination,  decode  the  packet,  and,  if  necessary, 
resend  it  to  its  true  destination.  Since  we  assume  that  the  network  is  completely 
connected,  rerouting  introduces  one  additional  unit  of  load  per  packet  rerouted.  In 
addition,  it  introduces  one  unit  of  delay  (one  additional  hop)  and  requires  one  addi- 
tional unit  of  processing  (at  the  intermediate  host). 

Delaying  packets  requires  that  there  be  sufficient  memory  available  at  the  host 
and  that  the  delays  incurred  are  tolerable.  This  can  smooth  out  small  temporal  vari- 
ation in  the  traffic  patterns,  but  cannot  change  the  traffic  pattern  over  a  sufficiently 
long  term. 

The  cost  in  increased  load  of  using  these  approaches  may  be  determined  by 
summing  the  dummy  and  rerouted  packets,  or  it  may  be  determined  directly  from 
the  original  traffic  matrix  M  and  the  apparent  traffic  matrix  A  produced  by  these 
camouflaging  measures. 
Let S be the aggregate load in the original traffic matrix:

\[ S = \sum_{i=1}^{n} \sum_{j=1}^{n} M[i,j], \]

and let T be the aggregate load in the apparent traffic matrix:

\[ T = \sum_{i=1}^{n} \sum_{j=1}^{n} A[i,j]. \]

Then Load Cost = T - S.

Figure 3.1. (a)-(c) Cost of Neutral Matrix by Padding Only. [Panels: Original TM; Dummy Packets; Neutral TM (Cost = 60 - 30 = 30).]
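The load cost is straightforward to compute from the two matrices. The sketch below pads every off-diagonal entry up to the largest one and reports T - S. The 3-node matrix `M` is invented for this example; it is not the matrix of figure 3.1, which is not reproduced here, although its totals were chosen to echo the cost of 30 quoted for padding alone.

```python
def aggregate(matrix):
    """Sum of all entries, i.e. the aggregate load of a traffic matrix."""
    return sum(sum(row) for row in matrix)


def pad_only(matrix):
    """Apparent matrix obtained by padding every link up to the busiest one."""
    n = len(matrix)
    peak = max(matrix[i][j] for i in range(n) for j in range(n) if i != j)
    return [[0 if i == j else peak for j in range(n)] for i in range(n)]


def load_cost(original, apparent):
    """Load Cost = T - S."""
    return aggregate(apparent) - aggregate(original)


# Hypothetical original traffic matrix (S = 30).
M = [[0, 10, 4],
     [4, 0, 4],
     [4, 4, 0]]
A = pad_only(M)          # every off-diagonal entry becomes 10 (T = 60)
cost = load_cost(M, A)   # 60 - 30 = 30
```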

3.2     A  Lower  Bound  on  Load  Cost 

It  has  been  suggested  previously  that  dummy  messages  may  be  used  to  mask  the 
true  traffic  matrix  [57],  but  it  may  be  very  costly  to  produce  an  apparently  neutral 
traffic  matrix  in  this  manner  (see  figure  3.1).  In  order  to  achieve  the  goal  of  a  neutral 
traffic  matrix,  we  reroute  some  of  the  traffic  from  a  given  source-destination  host 
pair  via  intermediate  hosts  (see  figure  3.2).  In  order  to  achieve  neutrality,  it  may  be 
expedient  to  generate  dummy  messages  that  pad  the  traffic  between  a  given  source- 
destination  host  pair.  However,  this  is  only  done  after  rerouting  has  decreased  the 
maximum  traffic  over  all  links. 
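The reroute-then-pad order described above can be sketched on a small example. The matrix `M` below is invented for illustration and is not the one drawn in the figures; its entries were chosen so that padding alone costs 30 while rerouting followed by padding costs 12. The helper names are likewise hypothetical.

```python
def total(matrix):
    return sum(sum(row) for row in matrix)


def reroute(matrix, src, dst, via, amount):
    """Copy of `matrix` with `amount` units of src->dst sent as src->via->dst.

    The direct link loses `amount`; both legs of the detour gain it, so
    the aggregate load grows by `amount`.
    """
    if len({src, dst, via}) != 3 or not 0 <= amount <= matrix[src][dst]:
        raise ValueError("need three distinct nodes and a feasible amount")
    out = [row[:] for row in matrix]
    out[src][dst] -= amount
    out[src][via] += amount
    out[via][dst] += amount
    return out


def pad_to_neutral(matrix):
    """Pad every off-diagonal entry up to the current maximum."""
    n = len(matrix)
    peak = max(matrix[i][j] for i in range(n) for j in range(n) if i != j)
    return [[0 if i == j else peak for j in range(n)] for i in range(n)]


# Hypothetical original traffic matrix with aggregate load 30.
M = [[0, 10, 4],
     [4, 0, 4],
     [4, 4, 0]]

padding_only_cost = total(pad_to_neutral(M)) - total(M)   # 60 - 30 = 30

R = reroute(M, src=0, dst=1, via=2, amount=3)   # peak drops from 10 to 7
N = pad_to_neutral(R)
reroute_then_pad_cost = total(N) - total(M)     # 42 - 30 = 12
```

Rerouting by itself adds 3 units of load, but it lowers the level to which every other link must be padded, which is where the saving comes from.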

Figure  3.1  and  figure  3.2  graphically  represent  the  cost  of  achieving  a  neutral 
traffic  matrix  using  dummy  packets  and  rerouting  respectively.  This  is  shown  in 
matrix notation in figure 3.3 and figure 3.4. For the example in figure 3.1, the load
cost of using dummy packets alone is 30, while a combination of rerouting and dummy
packets produces a load cost of only 12. In fact, this is the minimum load cost that
can be achieved for this example, as shown in the following proposition.

Figure 3.2. (a)-(e) Cost of Neutral Matrix by Rerouting and Padding

Figure 3.3. (a)-(c) Neutral Traffic Matrix by Padding Only.

Figure 3.4. (a)-(d) Neutral Traffic Matrix by Rerouting and Padding.

Proposition  1. 

For a given traffic matrix M, the minimum load cost, MLC, is given by

\[ MLC = n(n-1)\mu - S, \]

where

\[ S = \sum_{i=1}^{n} \sum_{j=1}^{n} M[i,j] \]

is the aggregate traffic in the original traffic matrix and

\[ \mu = \frac{\max \Bigl\{ \sum_{j=1}^{n} M[i,j], \ \sum_{j=1}^{n} M[j,i] \ : \ i = 1, 2, \ldots, n \Bigr\}}{n-1} \]

is the maximum average traffic into or out of a node in the original traffic matrix.

Proof:  The  node  v  that  has  the  maximum  amount  of  inbound  or  outbound 
traffic  must  still  have  that  total  amount  of  traffic  inbound  or  outbound  in  any  ap- 
parent traffic  matrix  that  satisfies  the  original  matrix,  regardless  of  how  the  traffic 
is  distributed  over  its  links.  For  an  apparent  traffic  matrix  to  be  neutral,  the  traffic 
on  all  its  links  must  be  equal.  This  traffic  will  be  determined  by  the  link  with  the 
heaviest  load  after  balancing.  This  is  the  lower  bound  for  v  which  is  also  a  lower 
bound  for  the  network  as  a  whole. 
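The bound in Proposition 1 can be computed directly from the row and column sums. In the sketch below the matrix `M` is a made-up 3-node example with a zero diagonal, not one taken from the figures, and the function name is hypothetical.

```python
def minimum_load_cost(matrix):
    """Return (mu, MLC) for a traffic matrix with a zero diagonal.

    mu is the largest row sum (outbound) or column sum (inbound),
    divided by the n - 1 links over which that traffic can be spread.
    """
    n = len(matrix)
    outbound = [sum(row) for row in matrix]
    inbound = [sum(matrix[i][j] for i in range(n)) for j in range(n)]
    mu = max(outbound + inbound) / (n - 1)
    s = sum(outbound)                      # aggregate original load S
    return mu, n * (n - 1) * mu - s


# Hypothetical matrix: node 0 sends 14 units, node 1 receives 14 units.
M = [[0, 10, 4],
     [4, 0, 4],
     [4, 4, 0]]
mu, mlc = minimum_load_cost(M)   # mu = 14 / 2 = 7, MLC = 6 * 7 - 30 = 12
```

For this matrix the busiest node handles 14 units over 2 links, so no neutral matrix can use a per-link level below 7, and the load cost cannot fall below 12.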

A lower bound exists for the least cost neutral traffic matrix that has a feasible
solution $\mu N_0$. Unfortunately, this may not be realizable, in the sense that there are
initial traffic matrices for which the total load is $\mu(n^2 - n)$, but that are not neutral.
Since rerouting always causes an increase in the total load, rerouting to create a
neutral traffic matrix must increase the total load above the given lower bound. An
example of a traffic pattern that has these properties is given in figure 3.5.

Figure 3.5. Example of TM for Which $\mu N_0$ Is Not Realizable

3.3  The  Traffic  Matrix  Operations  Approach 
The  proposed  approach  uses  rerouting  and  padding  to  minimize  the  load  cost  of 
the  observed  neutral  traffic  matrix.  Rerouting  causes  non-local  effects  and  therefore 
cannot  be  done  for  one  source-destination  pair  independently  of  the  rest.  When 
traffic from node a to node b is rerouted via node c in an attempt to balance node
a's output, the traffic from node c to node b is increased. Since all the rerouting
decisions must be considered simultaneously, we formulate these as a system of linear
inequalities. A solution, if one exists, to this system may provide a prescription for
rerouting traffic to achieve a maximum traffic matrix element no greater than the
selected threshold. If the solution has any negative rerouting quantities, then it is
deemed infeasible; otherwise the solution is feasible. The overall approach is then to
find the smallest neutral traffic matrix for which a feasible solution exists.

The problem is most naturally stated in terms of matrices, yielding a system of
linear equations, the solution of which gives the rerouting information. Some
preliminary definitions and symbols are needed before proceeding.

•  M  will  represent  the  initial  traffic  matrix  given  as  input. 

•  T  will  represent  the  target  traffic  matrix,  X  €  3. 

• Operator $f$ (for flatten), when applied to an $n \times n$ matrix $B$, creates an $n^2 \times 1$
column vector $B_f$ consisting of the elements of $B$ arranged in row-major order.

As examples, if $t$ is the matrix transpose operator,

$(B^t)_f$ is an $(n^2 \times 1)$ column vector, arranged in column-major order, and
$(B_f)^t$ is a $(1 \times n^2)$ row vector, arranged in row-major order.

• $r_{a,b,c}$ will represent the number of packets rerouted from source $a$ via intermediate
node $b$ to destination $c$, termed the reroute quantity for $a, b, c$.

• $RM_{a,b,c}$, the reroute matrix corresponding to $r_{a,b,c}$, is an $n \times n$ matrix that represents
the change in the apparent traffic matrix caused by rerouting one packet from source
$a$ via intermediate node $b$ to destination $c$.

$$RM_{a,b,c}[i,j] = \begin{cases} 1 & \text{iff } (i = a \wedge j = b) \vee (i = b \wedge j = c) \\ -1 & \text{iff } i = a \wedge j = c \\ 0 & \text{otherwise} \end{cases}$$

The reroute matrix represents the fact that the a-b traffic and b-c traffic increase
and the a-c traffic decreases as a result of rerouting the a-c traffic via b. Note that
the reroute matrices $RM_{a,a,a}$, $RM_{a,a,c}$, and $RM_{a,b,b}$, etc., have all zero elements (they
represent either self-communication or rerouting via either the source or destination
node themselves). There are $n^3$ each of the reroute quantities and their corresponding
reroute matrices.

Figure 3.6. The Difference Matrix, DM

• $R$ represents all the rerouting quantities; it is an $n^3 \times 1$ column vector of all $r_{a,b,c}$ in
lexicographic order. This is the information sought for the first step of the proposed
method.

•  The  change  in  the  traffic  due  to  padding  by  dummy  packets  is  represented  by  the 
n  x  n  non-negative  matrix  P. 

• The difference matrix, $DM$, represents the rerouting effects for all possible rerouting
quantities. It is an $n^2 \times n^3$ matrix with flattened rerouting effect matrices as its
columns, arranged in lexicographic order.

$$DM[i,j] = RM_{i',j',k'}[i'',j'']$$
where
$$i = n(i''-1) + j'', \quad \text{and} \quad j = n(n(i'-1) + j' - 1) + k',$$
$$\forall i', \forall j', \forall k' \in [1..n], \quad \text{and} \quad \forall i'', \forall j'' \in [1..n].$$

The main algorithm accepts as input the original traffic matrix, and tries to
generate  a  traffic  matrix  such  that  the  volume  of  traffic  between  all  given  source- 
destination  pairs  is  equal,  and  is  the  minimum  that  will  support  the  necessary  true 
traffic.  The  subroutine  we  are  using  accepts  the  original  traffic  matrix  and  a  target 
traffic  matrix  as  inputs,  then  seeks  a  feasible  rerouting  so  that  the  apparent  traffic 
matrix  is  dominated  by  the  target  traffic  matrix.  Our  goal  is  to  determine  the  vector 
of  reroute  quantities  R.  The  apparent  traffic  matrix  after  rerouting  may  then  easily 
be  padded  with  dummy  messages  in  order  to  make  the  final  apparent  traffic  matrix 
neutral.  Although  the  target  traffic  matrices  considered  here  will  all  be  neutral,  this 
is  not  a  requirement  of  the  subroutine  nor  of  padding. 

Proposition  2. 
For a given traffic matrix $M$, and a known target traffic matrix $T$, the rerouting
information can be found if there is a positive solution of $R$, which satisfies
$$DM \times R \le T_f - M_f$$

Proof:  The  above  equations  represent  the  fact  that  the  final  traffic  matrix  T 
will  be  the  sum  of  the  given  traffic  matrix  M,  the  rerouted  traffic  matrix  (determined 
from  DM  and  R),  and  the  padding  matrix  P.  The  rerouting  information  is  obtained 
by  solving  the  system  of  linear  inequalities  simultaneously. 
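
The definitions above can be made concrete with a small sketch. It builds the reroute matrices and the difference matrix with 0-based node indices, then checks a candidate reroute vector against the inequality of Proposition 2 and derives the padding matrix. It does not solve the system; the function names, the toy matrices `M` and `T`, and the chosen reroute vector are all assumptions for illustration, and only the conditions stated in the proposition are checked.

```python
from itertools import product


def flatten(B):
    """Operator f: row-major flattening of a square matrix into a list."""
    return [x for row in B for x in row]


def reroute_matrix(n, a, b, c):
    """RM_{a,b,c} for 0-based nodes; degenerate triples give the zero matrix."""
    RM = [[0] * n for _ in range(n)]
    if len({a, b, c}) == 3:
        RM[a][b] += 1   # a-b traffic increases
        RM[b][c] += 1   # b-c traffic increases
        RM[a][c] -= 1   # direct a-c traffic decreases
    return RM


def difference_matrix(n):
    """DM (n^2 x n^3): column (a, b, c), in lexicographic order, is flatten(RM_{a,b,c})."""
    cols = [flatten(reroute_matrix(n, a, b, c))
            for a, b, c in product(range(n), repeat=3)]
    return [[col[i] for col in cols] for i in range(n * n)]


def padding_for(M, T, R):
    """Return the padding matrix P = T - (M + rerouting effect) if R is
    non-negative and DM x R <= T_f - M_f holds elementwise; otherwise None."""
    n = len(M)
    if any(r < 0 for r in R):
        return None
    DM = difference_matrix(n)
    Mf, Tf = flatten(M), flatten(T)
    slack = []
    for i in range(n * n):
        change = sum(DM[i][j] * R[j] for j in range(n ** 3))
        if change > Tf[i] - Mf[i]:
            return None
        slack.append(Tf[i] - Mf[i] - change)
    return [slack[r * n:(r + 1) * n] for r in range(n)]


# Toy data (assumed): 4 packets from node 0 to node 1, target of 2 per link.
M = [[0, 4, 0], [0, 0, 0], [0, 0, 0]]
T = [[0, 2, 2], [2, 0, 2], [2, 2, 0]]
R = [0] * 27
R[3 * (3 * 0 + 2) + 1] = 2   # r_{0,2,1}: reroute 2 packets along 0 -> 2 -> 1
P = padding_for(M, T, R)
```

In this toy case the rerouting adds 2 units of load and the padding adds 6, for a total of 8, which equals the bound of Proposition 1 for this matrix (n(n-1)μ - S = 12 - 4). Without rerouting (all reroute quantities zero) the same target is rejected, because the 0-1 link would carry 4 packets.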

A feasible solution is one in which the final traffic matrix is neutral and the
reroute and padding quantities are positive and integer valued. A solution to the
system of linear inequalities can be found, if one exists, using any of the standard
techniques to solve n linear inequalities in n variables. However, such a solution need
not be unique [15]. It is possible to include the feasibility condition as an additional
constraint to the system of linear equations obtained from Proposition 2 and then try
to  obtain  the  solution  vector  R.  If  a  feasible  solution  is  found  using  the  R  obtained, 
we  will  be  able  to  reroute  some  traffic  and  add  dummy  packets  to  create  a  neutral 
apparent  traffic  matrix.  If  no  feasible  solutions  are  found,  then  a  larger  neutral  target 
matrix  is  used.  Using  the  neutral  traffic  matrix  obtained  from  rerouting  alone  as  the 
lower  bound  and  that  obtained  from  padding  alone  as  the  upper  bound,  a  form  of 
binary  search  may  be  used  to  speed  up  the  location  of  a  low  cost,  realizable  target 
matrix. 
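
The search over target matrices can be sketched as an ordinary binary search on the per-link threshold. The sketch assumes integer thresholds and a caller-supplied predicate `is_feasible` (a hypothetical stand-in for the linear programming step, not something defined in this chapter). The search is valid because feasibility is monotone: a reroute vector that satisfies the inequalities for one target also satisfies them for any larger target.

```python
def smallest_feasible_threshold(lower, upper, is_feasible):
    """Return the least integer t in [lower, upper] with is_feasible(t) true.

    Assumes is_feasible is monotone (once true, true for all larger t) and
    that is_feasible(upper) holds, as it does for the padding-only matrix.
    """
    lo, hi = lower, upper
    while lo < hi:
        mid = (lo + hi) // 2
        if is_feasible(mid):
            hi = mid          # mid works; look for something smaller
        else:
            lo = mid + 1      # mid fails; the answer is larger
    return lo


# Toy predicate (assumed): pretend thresholds of 3 or more are realizable.
best = smallest_feasible_threshold(2, 4, lambda t: t >= 3)
```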

The  main  algorithm  attempts  to  find  a  low  cost  combination  of  rerouting  and 
padding  using  linear  programming  to  seek  the  solutions  to  the  systems  of  inequalities 
it  generates.  The  procedure  must  terminate  because  the  neutral  matrix  formed  by 
padding  alone  without  rerouting  has  a  feasible  R. 

An integer linear program formulation of the problem is given in section 7.4.
In section 7.4.1, the performance of an implementation of the linear programming
formulation is compared with the simulation results of the algorithm to obtain a
spatially neutral traffic matrix.

3.4     Relevance  of  Model 

Prevention of traffic analysis can be achieved by several techniques, which include
link encryption, network level encryption and end-to-end security measures.
Link level encryption offers protection against direct traffic analysis but requires that
every node in the network be secure. If a single node is compromised, then high
level entities involved in the communication are revealed. Link level encryption also
requires all nodes in the network to participate in the security policy, regardless of the
security threat perceived by the node. This implies that even the nodes, and therefore
users, that do not require secure communication will pay the cost of and incur the
overhead due to the security policy. We consider the above to be serious limitations
in link level encryption policies for providing secure communication. Network level
encryption policies offer comparable protection against traffic analysis but require
the source and destination addresses to be encrypted. Due to the encrypted address,
each packet must be broadcast to every node in the network. Since broadcasting each
packet is an expensive proposition which could degrade system performance due to
congestion and other factors, network level encryption is rarely used for the prevention
of traffic analysis. End-to-end security measures, such as the one proposed in
this chapter, have the advantage of being flexible enough to model the user's specific
security requirements. Also, it is not necessary for all nodes in the network to
participate in the security policy; only the nodes that desire protection against traffic
analysis can participate in the security model and only the participating nodes pay
for the security costs. We believe the ISO OSI transport layer to be ideal for
implementing such security policies.

In this chapter we presented an end-to-end policy for the prevention of traffic
analysis. We defined methods used for achieving this goal and gave cost measures
for comparing alternative solutions. A lower bound on the load cost for preventing
traffic analysis under this model was proven and we provided examples for which this
bound could and could not be achieved. Finally, we suggested a linear programming
approach to finding low cost means of preventing traffic analysis using rerouting and
dummy packets. This method is guaranteed to work and will impose no greater load
cost than using dummy packets alone.

A  discussion  of  the  issues  involved  with  incorporating  this  model  with  trusted 
computing  bases  (TCBs),  and  network  TCBs  (NTCBs)  is  relevant.  The  approach 
outlined  here  can  be  a  component  of  a  system  providing  class  B2  level  protection. 
In  class  B2  systems,  the  TCB  is  responsible  for  discretionary  and  mandatory  access 
control enforcement. Class B2 systems also attempt to reduce, if not eliminate, covert
channels [10, 37].

There  are  two  ways  in  which  the  presented  approach  could  be  useful: 

1.  install  this  model  at  the  host  in  the  transport  level; 

2.  install  this  model  in  IMPs  as  an  end-to-end  network  layer,  on  top  of  the  usual 
network  layer,  similar  to  the  internet  layer  in  the  DoD  protocol  suite. 

The  first  alternative  allows  a  cooperating  set  of  hosts  to  use  untrusted  networks 
yet  still  prevent  traffic  analysis  and  information  leaks.  There  is  no  provision  for 
active  attacks  without  cooperation  from  the  network  itself.  The  second  approach 
could  place  a  sort  of  end-to-end  network  layer  in  the  IMPs  themselves,  external  to 
the hosts, and make this layer part of the NTCB. As part of the NTCB, the
communications concerning traffic levels, the pad messages, and the rerouted messages are
not accessible to the hosts, so untrusted hosts may be tolerated. An eavesdropper is
prevented from gaining any information about rerouting or traffic patterns. This
allows the NTCB to handle both active and passive attacks, with network layer policy
routing dealing with the active attacks and the end-to-end network layer rerouting
(given  here)  dealing  with  passive  attacks.  In  neither  option  are  data  contents  for 
third  party  hosts  divulged  when  the  layer  performing  the  rerouting  is  compromised 
unless  the  decryption  key  for  the  third  party  host  has  also  been  compromised. 

In  summary,  this  model  represents  a  step  in  the  direction  of  mathematically 
modeling  the  problem  of  preventing  traffic  analysis.  We  give  an  algorithmic  approach 
based  on  linear  programming  to  find  solutions  to  the  rerouting  problem,  and  the  cost 
measures  proposed  may  be  used  as  a  basis  of  comparison  for  our  methods  and  those 
yet  to  be  developed. 

In  the  next  chapter,  we  extend  our  model  to  address  temporal  variation  in  traffic 
characteristics.  Two  transmission  scheduling  policies  that  eliminate  or  reduce  covert 
channels  due  to  temporal  variation  are  also  presented.  A  discussion  of  the  tradeoffs 
between  the  adaptability  of  the  transmission  schedule  and  the  existence  of  covert 
channels  is  also  included. 


CHAPTER  4 
EXTENSIONS  TO  ADDRESS  TEMPORAL  VARIATION  IN  TRAFFIC 


4.1     Introduction 

In  this  chapter,  we  propose  scheduling  strategies  that  operate  at  the  transport 
layer  to  generate  transmission  schedules  that  prevent  traffic  analysis  and  the  creation 
of  covert  channels  due  to  temporal  variation  in  the  transmission  of  packets.  In  addi- 
tion to  requiring  the  traffic  matrix  be  spatially  neutral,  we  require  the  transmission 
schedule  be  temporally  neutral  to  eliminate  potential  covert  channels.  The  static 
scheduling  policy  generates  temporally  neutral  transmission  schedules.  We  extend 
the  static  scheduling  policy  to  develop  an  adaptive  scheduling  policy  that  can  adapt 
to  long  term  load  fluctuations.  The  adaptive  algorithm  is  expensive  and  there  exists 
the  possibility  of  a  low  bandwidth  and  noisy  covert  channel;  we  suggest  mechanisms 
to  reduce  the  bandwidth  of  the  covert  channel.  The  tradeoff  is  between  the  adapt- 
ability of  the  scheduling  policy  and  the  bandwidth  of  the  covert  channel. 

The  model  presented  in  chapter  3  to  prevent  traffic  analysis  addresses  spatial 
neutrality.  This  implies  that  the  intruder  cannot  derive  any  useful  information  by 
observing  the  traffic  on  the  network  as  the  volume  and  nature  (packet  size  and  type)  of 
traffic  between  any  source-destination  pair  is  identical  to  that  between  any  other  pair. 
However, we are concerned with the temporal variation in traffic and the possible
introduction of covert channels due to the variation in transmission characteristics
over  time.  Note  that  although  the  total  volume  of  traffic  exchanged  between  any 
pair  of  nodes  in  the  network  is  the  same  to  satisfy  the  spatial  neutrality  criterion,  the 
source  could  transmit  the  packets  in  a  burst  or  could  spread  out  its  transmissions  over 
a  period  of  time.  The  model  does  not  impose  any  restrictions  on  the  transmission 
schedule  and  therefore  a  knowledgeable  user  might  be  able  to  communicate  with  his 
accomplice  by  timing  the  transmissions,  thus  introducing  covert  channels. 

We  propose  a  mechanism  to  generate  a  transmission  schedule  that  will  satisfy 
our  primary  goal  of  prevention  of  traffic  analysis  and  the  prevention  of  covert  channels 
due  to  temporal  variation  in  packet  transmission  schedule.  As  in  the  model  for  spatial 
neutrality,  the  transmission  schedules  are  intended  to  operate  at  the  transport  layer 
of the ISO OSI model. We adopt a slotted time system in which nodes transmit
fixed  size  packets  in  fixed  size  slots.  By  generating  temporally  neutral  (defined  in 
section  4.3)  transmission  schedules,  we  eliminate  certain  covert  channels.  We  propose 
a  static  scheduling  policy  in  which  the  transmission  schedule  is  fixed  and  the  nodes 
just  follow  the  transmission  schedule  at  all  times.  Since  this  transmission  schedule 
is  known  and  remains  the  same  over  a  period  of  time,  there  is  no  possibility  of  a 
covert  channel.  However,  such  a  scheduling  policy  is  not  responsive  to  changes  in  the 
load  and  can  degrade  system  utilization  and  performance  significantly.  We  therefore 
propose  an  adaptive  scheduling  policy  that  can  adapt  to  variations  in  the  load,  but 
which leaves open the possibility of a covert channel. However, the covert channel has
very low bandwidth and is noisy; the adaptiveness of the mechanism can be traded
off for the capacity or even existence of the covert channel.

4.2     Discussion  of  Slotted  Time 

We  have  an  n  x  n  (neutral)  traffic  matrix,  with  nodes  numbered  1  to  n,  repre- 
senting the  volume  of  traffic  between  pairs  of  nodes  over  some  period  of  time.  We 
assume  that  all  transmitted  packets  have  the  same  length  and  that  each  packet  re- 
quires one  time  unit  (called  a  slot)  for  transmission.  Associated  with  each  node  is 
a  virtual  queue  to  buffer  packets  before  they  can  be  transmitted.  During  a  given 
period j, new packets arrive and are placed in the buffer; packets arriving in the
current period may not be transmitted until the next period. Any packet in the
buffer  at  the  beginning  of  the  period  is  eligible  for  transmission.  We  will  now  define 
some  important  terms.  See  figure  4.1. 

• Slot: This is the basic time unit during which a given node may send or receive
at most one packet. We assume that at most one node can transmit per slot,
so n(n - 1) slots are needed for all pairs to communicate.

• Period: A period is a set of successive slots during which one phase of the
transmission schedule is carried out. In our model, a period consists of n(n - 1)
active slots and m idle slots. In the static scheduling policy, m is a constant;
in the adaptive scheduling policy, m may vary over time.

Figure 4.1. Slotted Time System (diagram labels: Cycle i; Period (i+1); active slot =
transmission time for scheduled packets (original + reroute) or dummy packets; idle time)

•  Cycle:  A  set  of  successive  periods  during  which  the  number  of  idle  slots,  m, 
does  not  vary  is  a  cycle.  Typically  the  transmission  of  the  entire  volume  of 
communication  as  prescribed  by  the  traffic  matrix  is  carried  out  in  one  cycle. 

We  will  now  define  the  arrival  of  a  packet,  the  buffering  of  packets  (due  to 
batched  arrivals)  and  the  transmission  (departure)  of  packets  during  a  slot.  Note 
that  here  we  are  accounting  for  independent  events;  we  are  not  concerned  with  the 
"transmission  schedule"  as  yet. 

•  The  arrival  of  packets  at  node  i. 

The  arrival  of  packets  at  a  given  node  can  be  accounted  for  as  follows: 

1. The packet belongs to actual traffic (as defined by the traffic matrix) with
source i, destined for node j.

2. The packet from source node k destined for node j is rerouted via an
intermediate node i. The packet has arrived at node i from node k in this
period and will be queued for transmission to node j in the next period.

• Departure of packets from node i.

In  each  period,  each  node  will  transmit  exactly  one  packet  to  every  other  node 
in  the  network.  A  packet  may  not  arrive  and  depart  in  the  same  period;  all 
packets  that  arrived  during  the  previous  period  are  eligible  for  transmission  and 
may  be  transmitted  if  scheduled. 

•  Backlog  of  packets  at  node  i. 

Packets  remaining  in  the  virtual  queue  at  the  end  of  each  period  are  known  as 
backlog  packets.  These  packets  are  eligible  for  transmission  in  the  next  period. 
Let A(i) represent the number of arrivals in period i, D(i) the departures in
period i and B(i) the number of packets in the virtual queue (backlog) at the
end of period i. We then have the following relation which gives the number of
backlog packets:
B(i) = B(i - 1) - D(i - 1) + A(i) for i > 0.
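
The backlog relation can be traced with a few lines of code. This is a literal transcription of the recurrence as stated; the base case B(0) is not given in the text, so it is passed in as an assumed parameter, and the arrival and departure counts are made-up sample values.

```python
def backlog(arrivals, departures, initial_backlog=0):
    """Apply B(i) = B(i-1) - D(i-1) + A(i) for i > 0, with B(0) = initial_backlog."""
    B = [initial_backlog]
    for i in range(1, len(arrivals)):
        B.append(B[i - 1] - departures[i - 1] + arrivals[i])
    return B


# Sample data (assumed): arrivals A(0..3), departures D(0..3), and B(0) = 2.
A = [2, 1, 0, 3]
D = [1, 1, 1, 1]
B = backlog(A, D, initial_backlog=2)
```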

4.3     Characterizing Temporal Variation in Traffic

We  have  alluded  to  the  creation  of  a  covert  channel  due  to  temporal  variation 
in  the  transmission  of  packets.  Clearly,  in  a  secure  network  we  would  like  to  prevent 
the  creation  of  any  covert  channels.  We  describe  two  related  techniques  to  generate 
transmission  schedules  such  that  the  eavesdropper  cannot  gain  any  useful  informa- 
tion by  observing  the  traffic  characteristics  on  the  network.  Below  we  define  certain 
criteria for transmission schedules; the enforcement of each of these criteria
guarantees a temporally neutral transmission schedule.

Figure 4.3. Covert Channel due to Transmission Order

• The Volume (V) of communication between a given pair of nodes.

In a neutral traffic matrix, the volume of communication between each pair of nodes
over the observation interval is the same, and the model for spatial neutrality
presented in the previous chapter does not require this information to be secret. By
imposing the spatial neutrality criterion on the original traffic matrix, we have
effectively eliminated the volume of communication between any pair of nodes as a
contributing factor to the covert channel.

•  The  Frequency  (F)  of  communication  between  a  given  pair  of  nodes. 
In  figure  4.2,  we  see  that  information  can  be  encoded  by  timing  the  transmission  of 
packets.  In  this  case,  even  though  the  average  frequency  is  constant  due  to  spatial 
neutrality, the distribution of packets within the observation interval creates a
potential covert channel. By requiring that each node exchange a packet with every
other node in the network each period, we eliminate the use of this characteristic of
transmission to create a covert channel.

• The Order (O) of communication between a set of nodes.

Even  if  we  insist  that  each  node  send  every  other  node  exactly  one  packet  per  period, 
the  order  of  communication  could  create  a  potential  covert  channel.  For  example,  if 
a  node  sends  a  packet  to  node  i  before  sending  a  packet  to  node  j  versus  sending 
a  packet  to  node  j  before  node  i,  then  information  can  be  encoded  in  the  order  of 
transmission.  For  example,  in  figure  4.3,  the  node  sends  a  packet  to  node  A  followed 
by  a  packet  to  node  B,  i.e.  in  order  AB,  to  encode  "1",  and  the  reverse  order  BA 
to  encode  "0".  If  the  intruder  and  his  accomplice(s)  can  affect  the  transmission 
order in k nodes, then k! transmission orders are possible, providing a bandwidth
of as much as $\log(k!) \ge \frac{k}{2}\log k$ bits per period. Similarly, the position within a
period  of  transmission  to  a  particular  node  may  be  used  to  convey  information.  By 
requiring  that  each  node  communicate  with  every  other  node  in  a  predetermined  order 
during  all  periods,  the  order  and  position  of  communication  remains  the  same  thus 
preempting  the  use  of  this  characteristic  of  transmission  to  create  a  covert  channel. 
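
To give a feel for the size of the ordering channel, the short sketch below evaluates log2(k!) for a given number k of controllable nodes. The function name is an assumption for illustration, and base-2 logarithms are used so that the result is in bits.

```python
import math


def order_channel_bits(k):
    """Bits per period available from choosing among k! transmission orders."""
    return math.log2(math.factorial(k))


two_nodes = order_channel_bits(2)    # orders AB and BA: one bit, as in figure 4.3
four_nodes = order_channel_bits(4)   # 24 possible orders
```

With two nodes the two orders AB and BA carry exactly one bit per period; with four nodes the 24 possible orders carry a little over four and a half bits.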

•  The  (extrinsic)  Nature  (N)  of  communication  between  a  set  of  nodes. 
Given that the volume, frequency and order of communication are the same (due
to the V, F and O criteria discussed above), the extrinsic nature of communication
could create a potential covert channel. Assuming that the packets are encrypted,
the intruder cannot see the contents of the packets. However, extrinsic characteristics
like packet size can be used to exchange information covertly. For example, a user
may send his accomplice a packet of some predetermined size followed by another
packet of a different size to exchange some information covertly. We can eliminate
these covert channels by requiring that the extrinsic characteristics of all packets be
the same by enforcing a fixed packet size and encrypting the packets.

• The Length (L) of transmission.

If the volume, frequency, order and nature of communication among nodes are the
same (due to the V, F, O and N criteria discussed above), then the question that
arises is: for how long do these transmission characteristics remain unchanged and
if they do change, when and how do they change? For example, if a single user is
able to change any of the above parameters just by performing some local operations
like increasing the load on the system or choosing to ignore any of the globally
accepted parameters, then he can easily create a covert channel to communicate
with his accomplice. To eliminate this possibility, we should ensure that the globally
selected parameters remain the same for the observation period. However, since the
system may have to respond to traffic changes, it is difficult to eliminate this channel.
At best, we want a single user to be incapable of unilaterally changing the global
parameters. Any changes should be made by a negotiation process involving at least a
majority of nodes, if not all nodes, and the changes should be effected in a controlled
manner. By building the communication system over NTCBs we can eliminate many
problems, the most important being the violation of any aspect of the transmission
protocol [10, 61].

From the above discussion we can describe a transmission schedule by the five
tuple <V, F, O, N, L>. Depending on the information that can be encoded in any
of these transmission characteristics, a covert channel may exist.

Definition: A temporally neutral transmission schedule is one in which none of
the members of the tuple <V, F, O, N, L> have any characteristic that can be
used to encode any information and communicate surreptitiously via a covert
channel.

The transmission schedule that satisfies each of the restrictions in <V, F, O, N, L> is
temporally neutral. Also, our definition of temporally neutral transmission schedule
includes spatial neutrality. In effect, a user in collusion with an accomplice should
not be able to use the volume, frequency, order, nature of communication, and the
duration (length) of transmission in the network to exchange information
surreptitiously and should not be able to gain any useful information regarding the traffic
matrix, source or destination user identity, etc., just by observing the flow of packets
on the network.

4.4     Transmission  Schedules  for  Temporal  Neutrality 

In  this  section,  we  outline  the  static  and  adaptive  scheduling  policies  and  briefly 
discuss  the  potential  covert  channel  due  to  adaptive  scheduling  policy. 

4.4.1     The  Static  Scheduling  Policy 

We are given an n x n neutral traffic matrix and our goal is to develop a
temporally neutral transmission schedule. The solution we propose uses slotted time
to transmit packets. The period contains n(n - 1) active slots and m idle slots.
The arrival, buffering and departure of packets in a period were discussed earlier in
section 4.2.

Figure 4.4. The Static Scheduling Policy

Figure 4.4 shows the transmission schedule for a period as per the static scheduling
policy. The new arrivals and the rerouted packets received during period i are
eligible for transmission during period (i + 1). Nodes 1 and 4 have at least one packet
in the buffer eligible for transmission at the beginning of period i and are scheduled
for transmission. Note that at the beginning of period i, there are no packets in the
backlog buffer for nodes 2 and 3. This implies that there were no new arrivals or
rerouted packets for either of the nodes during the period (i - 1); dummy packets are
generated on behalf of nodes 2 and 3. The figure also shows the arrival of packets
from nodes 3, 2, 1 and 4 destined for the local node. Thus exactly one packet is
received and transmitted between each pair of nodes in period i. Note that in period
i, a packet is generated in the local node destined for node 4, routed via node 2. The
local node immediately enqueues a packet on the intermediate node 2's queue and
marks it "destined for node 4". In the next period, node 2 transmits this packet to
the appropriate destination (node 4). This is shown in the figure as "4(via 2)". The
queuing of packets in the virtual queues is shown in dotted lines.

The status of queues at the beginning of period (i + 1) can be explained as
follows: the packets that arrived for nodes 1, 3 and 4 in period i are enqueued in
the queues associated with nodes 1, 3 and 4 respectively. If a packet is not scheduled
for transmission, then it is backlogged at this period, as indicated in the figure by
a packet marked "B" which has been enqueued and not scheduled. The queue for
node 4 shows a packet backlogged from the previous period. The "4(via 2)" arrival
in period i enqueues a packet in node 2's queue. Thus each node has at least one
packet to transmit in its queue.

The  timing  and  order  of  transmission  (4,3,2,1)  and  the  order  of  packet  arrival 
(3,2,1,4)  remain  the  same  over  all  periods.  The  actual  order  is  not  important;  the 
order  of  transmission  could  be  something  as  simple  as  round-robin  order.  The  real 
issue  is  that  we  need  to  decide  an  order  and  ensure  that  it  is  strictly  followed  in  all 
periods of the cycle. The volume of communication between each pair of nodes has to be
the same to satisfy the spatial neutrality criterion. If we build the communication
protocol on an NTCB and fix extrinsic packet characteristics like the packet size and
encryption algorithm, we can be reasonably confident of satisfying the N restriction
of < V, F, O, N, L >. Since this is a static policy, there are exactly n(n - 1) active
slots in the period and m idle slots. This implies that the length of the period is
fixed at n(n - 1) + m and remains the same at least until the end of this cycle. Thus
we have satisfied the V, F, O, L and N restrictions of the tuple < V, F, O, N, L >.

We  have  shown  that  the  transmission  schedule  generated  by  the  policy  satisfies 
each of the restrictions in the tuple < V, F, O, N, L >. Therefore by our definition,
the  transmission  schedule  generated  by  the  static  scheduling  policy  is  temporally 
neutral. 

4.4.2     The  Adaptive  Scheduling  Policy 

All aspects of the adaptive transmission policy are the same as the static policy
except that we now have a variable number of idle slots in a period. Varying the
number of idle slots lets the scheduling algorithm adapt to variations in load to
satisfy increased bandwidth requirements. The traffic matrix is required to be
neutral and each node will exchange one packet with every other node in the network
per period. The order of transmissions is maintained the same for the entire cycle
and the extrinsic characteristics of the packets do not change. Since the adaptive
scheduling policy is similar to the static scheduling policy, we can see that the
V, F, O and N restrictions of < V, F, O, N, L > are satisfied in the adaptive
scheduling policy. The only restriction that we cannot satisfy is the L restriction of
the tuple < V, F, O, N, L >. This is because the nodes may change the number of
idle slots, thereby changing L, the length (or duration) of transmission. Though this
could potentially introduce a low bandwidth, noisy covert channel, we feel that this tradeoff
may be acceptable in order to have an adaptive scheduling policy.

Figure 4.5. The Adaptive Scheduling Policy

When a node or a group of nodes see a need to change the number of idle slots to
accommodate additional traffic, they initiate the negotiation process. The details of
the negotiation process are not needed to understand the model and we will not discuss
them further. It is sufficient to realize that it is possible for the nodes to agree upon a new
number of idle slots for future periods of the same cycle. As seen in figure 4.5, after a
sustained increase in the load, the nodes negotiate and decide to decrease the number
of idle slots per period. The number of active slots in the period remains the same,
but the total period length decreases (by one slot), thereby increasing the utilization
from $\frac{n(n-1)}{n(n-1)+m}$ to $\frac{n(n-1)}{n(n-1)+m-1}$. Note that the length of the period L and therefore the
transmission characteristic has changed. It is this possibility that prevents us from
guaranteeing the L restriction in < V, F, O, N, L > and leaves open the possibility of
a covert channel.
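
As a small numerical illustration of this effect, the sketch below takes utilization to be the fraction of active slots in a period, n(n - 1) out of n(n - 1) + m. That reading is an assumption drawn from the period structure defined earlier, and the function name `utilization` and the values of n and m are hypothetical.

```python
from fractions import Fraction


def utilization(n, m):
    """Fraction of a period occupied by active slots.

    n: number of nodes (n * (n - 1) active slots per period).
    m: number of idle slots per period.
    """
    active = n * (n - 1)
    return Fraction(active, active + m)


# Four nodes, with the idle-slot count renegotiated from 3 down to 2.
before = utilization(4, 3)
after = utilization(4, 2)
print(before, after)  # exact fractions; the second is the larger
```

Removing one idle slot shortens the period by one slot, so the utilization rises while the number of active slots stays fixed.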

It should be noted that no single node can affect the number of active and idle
slots significantly without reaching a consensus with other nodes in the network.
Therefore the potential of a single node to change the transmission schedule is very
limited. For example, a user may try to change the load on a particular node in an
attempt to change the transmission characteristics, which could be observed by an
accomplice on the network, thus creating a covert channel. However, in response to
the variation in the load, the scheduling policy initiates the negotiation protocol to
decide on new transmission characteristics. Since the negotiation for new transmission
characteristics is not done frequently and is a global activity, the capacity of this
covert channel is very low. Also, any eventual changes to the transmission schedule
after the negotiation process are due to the cumulative effects of several individual
nodes' (users') actions and views of the network, and the effect of any single node on
the transmission characteristics is relatively minor. If a node is using all its capacity
and wants to increase its traffic to a particular node by k packets, then due to the
spatial neutrality criterion, it must increase its traffic by a factor of kn. Also, due to
the non-local effect of rerouting, traffic on other nodes is affected as well and there
might exist some excess capacity after negotiation. Therefore the covert channel has
low capacity and is very noisy.

Having  shown  the  possibility  of  existence  of  a  covert  channel,  we  now  suggest 
mechanisms  to  reduce  the  capacity  of  the  covert  channel,  if  not  eliminate  it. 

•  No  idle  slots 
If we use the network at full capacity as allowed by the protocol, we can completely
do away with the idle slots in a period and thus eliminate any possibility of covert
channels according to our definition of temporally neutral transmission. However, the
scheduling policy then degenerates to a simple static scheduling policy. Moreover, if a
node is using all its capacity, i.e., there are no idle slots in a period, then the scheme
is very costly because the volume of true traffic may be only a small fraction of the
capacity being used.

•  Renegotiate  transmission  characteristics  at  cycle  boundaries 
In  this  option,  we  restrict  the  times  at  which  the  scheduling  policy  can  respond  to 
variations  in  the  load.  Since  the  cycle  length  is  considerably  longer  than  the  period 
length,  the  nodes  will  have  to  buffer  all  the  packets  generated  due  to  the  additional 
load  (in  this  cycle)  and  dispatch  them  at  the  usual  rate.  The  nodes  have  to  wait 
until  the  beginning  of  a  new  cycle  before  the  period  characteristics  can  be  changed. 
This could introduce severe queuing delays and adversely affect the Quality of Service
(QOS) requirements. In fact it is entirely possible that by the time a cycle terminates,
the load on the network has smoothed out and there is no necessity to renegotiate
the active and idle time slots. In this case, the user tried to create a covert channel,
but was unsuccessful and no information was communicated at all. Since the cycle
boundaries are far apart, the capacity of the covert channel is considerably reduced.
The key to the success of this mechanism is that the transmission parameters are
constant over long durations, i.e., the negotiations are few and far between and the
nodes decide to use additional capacity in small increments.

4.5     Static  versus  Adaptive  Scheduling  Policy

The static scheduling policy is efficient; the computation costs are negligible
and are incurred only once at the beginning. Also, there is no cost overhead due
to the scheduling policy itself in terms of the number of packets transmitted, as
no additional information is exchanged between the hosts to implement the protocol
during actual transmission of the packets. However, at the beginning of transmission,
the nodes need to negotiate the transmission parameters and for long transmission
sequences (i.e., for a long cycle), this cost is amortized over successive phases of the
transmission. In the static scheduling policy, the tradeoff is between wasted capacity
and poor Quality of Service (QOS) versus the elimination of covert channels.

In contrast to a completely secure static transmission policy, we can adopt a more
responsive transmission policy with the penalty that there might exist a covert
channel. However, by ensuring that the transmission parameters remain the same over
long cycles, there are fewer chances to change the transmission parameters, thereby
reducing the observable variation of the network traffic characteristics. Thus we have
a low capacity, noisy covert channel.

In times of crisis, the V, F, O, N and L characteristics could change significantly
and the scheduling policies could suffer performance degradation. Under both
scheduling policies, when the load is low or constantly varying or when the number
of nodes in the network is large, the penalty incurred could be quite high. In the
static scheduling policy, we have to pad the actual traffic with dummy packets. This
implies that we may see too high a total load relative to the actual traffic. In the
adaptive scheduling policy, variation in load causes additional traffic to be backlogged
and the packets would suffer significant transmission delays. However, we feel that
by assigning dynamic priorities to packets, we can ensure speedy delivery of certain
packets at the expense of regular traffic.

4.6     Medium  Access  Issues

In  this  section,  we  discuss  the  transport  layer  and  medium  access  issues  related 
to  the  transmission  of  packets  according  to  the  static  or  adaptive  scheduling  policies 
discussed  above. 

According  to  the  scheduling  policies,  each  node  exchanges  one  packet  with  every 
other  node  in  the  network  during  each  period.  Individual  packets  are  transmitted  at 
fixed slot times. While the notion of slot time is obvious at the physical layer, at
the medium access (and transport) layer, we define slot time to indicate the amount
of time it takes a node for the eventual scheduling and successful transmission of a
packet. Since our focus in this chapter is the use of scheduling policies to achieve
temporal neutrality, we will discuss the basic operation of medium access in this
context. The issues of interest include a synchronization mechanism to assure a strictly
ordered access to the transmission medium by the nodes to ensure that each node
exchanges exactly one packet with every other node in a particular order. We assume
that the nodes are built on a Trusted Computing Base (TCB) and access the network
via a Trusted Network Interface (TNI). We also assume that the transmissions are
error free.

The constraints imposed by the scheduling policy greatly simplify its
implementation at the transport layer. As the scheduling
policy dictates the order of transmission, we can imagine a token being passed from
a node that is currently transmitting a packet to the node that is its successor in the
transmission order. Since we assume a point-to-point network, this token passing is
similar to that of the IEEE 802.5 token ring protocol. On the other hand, if we assume
a broadcast medium, then the token passing is similar to that of the IEEE 802.4 token
bus protocol.

For the correct transfer of the token (i.e., permission to access the transmission
medium) and the data packets, nodes need to achieve and maintain synchronization.
Synchronization can be enforced by the scheduling policy at the beginning of each
period, marked by the transmission of n × (n - 1) packets (zero packets for the period).

In  order  to  ensure  that  the  frequency  of  transmission  is  the  same,  the  protocol 
computes  the  maximum  time  required  for  the  successful  transmission  of  a  scheduled 
packet.  Assuming  that  there  are  no  transmission  errors,  each  node  waits  for  this 
time  interval  after  the  scheduling  of  a  packet  by  its  predecessor  before  attempting 
to transmit its own packet. Since the scheduling policy requires that all packets
be of the same size (to satisfy the N criterion of the < V, F, O, N, L > requirement), the
time  interval  can  be  precomputed.  Any  drift  in  time  can  be  compensated  during  the 
synchronization  phase  at  the  beginning  of  each  new  period. 

For example, consider a network with three nodes named a, b, and c. Let us
assume that the scheduling policy requires the order of transmission to be < a, c, b >
and let the time to transmit a packet be T_t. During the first period, after the
initialization phase, a transmits a packet to node c (represented as a → c) followed
by the transmission a → b. Node c, which is the next node to transmit, waits
for T_t units of time after the beginning of transmission a → b, before transmitting
c → a followed by the transmission c → b. Node b now waits for T_t units of time
before initiating its transmissions b → a and b → c. Synchronization and ordering
of successive transmissions from the same node to different destinations is easily
handled.
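
The three-node example can be reproduced with a short sketch. It rests on assumptions made here for illustration: each sender addresses its destinations in the same global order used for senders, every transmission occupies one slot of length T_t, and the function name `period_schedule` is hypothetical.

```python
def period_schedule(order, slot_time=1.0):
    """Build one period of the fixed transmission schedule.

    order:     global transmission order of the nodes, e.g. ["a", "c", "b"].
    slot_time: time to transmit one fixed-size packet (T_t in the text).
    Returns a list of (start_time, sender, receiver) tuples.
    """
    schedule = []
    slot = 0
    for sender in order:
        for receiver in order:
            if receiver == sender:
                continue  # a node never sends to itself
            # Each transmission starts exactly one slot after the previous one.
            schedule.append((slot * slot_time, sender, receiver))
            slot += 1
    return schedule


schedule = period_schedule(["a", "c", "b"], slot_time=1.0)
for start, sender, receiver in schedule:
    print(f"t={start}: {sender} -> {receiver}")
```

With three nodes the period holds n(n - 1) = 6 transmissions, and since the start times depend only on the order and the slot time, every period looks identical to an observer.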

Thus one period of transmission is completed where each node has exchanged
exactly one packet with every other node in the network. Now the nodes enter a
synchronization phase (if necessary) following which they commence the transmissions
required of the next period. This continues until either all the traffic dictated by
the current traffic matrix has been transmitted or a sustained load change necessitates
renegotiation of transmission characteristics. In either case, new transmission
characteristics are negotiated for future transmissions and the nodes enter the
synchronization phase before transmission begins for the next period.

The above transmission policy, while accomplishing the goal of generating a
temporally neutral transmission schedule, has the drawback of under-utilizing
communication resources. For example, even if each node in the network has the
capability to transmit on multiple channels, currently it uses only one channel to exchange
packets with other nodes in the network. This drawback can be addressed by designing
a packet scheduling policy that uses the available channels to exchange packets in
parallel. The model would still require that each channel satisfy the < V, F, O, N, L >
restrictions for temporal neutrality. This might lead to an increase in the volume of
dummy packets transmitted. We believe that the basic tradeoff here is between the
nature and severity of threat faced by the secure network installation versus the
utilization of the hardware and communication resources. The more severe the threat,
the less flexible (or adaptive) the scheduling policies will be, and therefore the poorer
the utilization of resources.

As in the IEEE 802.4 or IEEE 802.5 protocols, it is possible to implement a priority
mechanism with this protocol. However, we will not discuss this issue further in this
dissertation. Issues such as connection establishment and maintainability, fault
management and recovery, etc., though important, are also beyond the scope of this
dissertation.

4.7     Conclusion 

In this chapter, we formalize the notion of a temporally neutral transmission
schedule and propose two scheduling policies: the static scheduling policy, which
generates temporally neutral transmission schedules, and the adaptive scheduling policy,
which generates temporally neutral transmission schedules with the possibility of a low
capacity covert channel. The adaptability of the adaptive scheduling policy can be
traded off for the existence of a low capacity and noisy covert channel. Depending on
the specific operating environment and the degree of perceived threat to an installation
by an intruder, an appropriate scheduling policy may be selected. The ability to
tolerate covert channels, required responsiveness and the cost to generate a temporally
neutral transmission schedule are three useful metrics to help select and evaluate
transmission schedules.

In the next chapter, we interpret our model as a mode-based security system
and use Millen's method to estimate covert channel capacity. We derive useful results
for general cases of channel capacity.


CHAPTER  5 
COVERT  CHANNEL  CAPACITY  ESTIMATION 


5.1     Introduction 

The  previous  chapter  proposed  a  mechanism  to  generate  transmission  schedules 
that  will  satisfy  our  primary  goals  of  prevention  of  traffic  analysis  and  the  prevention 
of  covert  channels  due  to  temporal  variation  in  a  packet  transmission  schedule.  In  the 
static  scheduling  policy,  the  transmission  schedule  is  fixed  and  the  nodes  just  follow 
the  transmission  schedule  at  all  times,  immune  to  variations  in  the  system  load.  Since 
this  transmission  schedule  is  known  and  remains  the  same  over  a  period  of  time, 
there  can  be  no  covert  channels  due  to  temporal  variations  in  traffic  characteristics. 
However,  such  a  scheduling  policy  is  not  responsive  to  changes  in  the  load  and  can 
degrade  system  utilization  and  performance  significantly.  We  therefore  propose  an 
adaptive  scheduling  policy  that  can  adapt  to  variations  in  the  system  load,  but  which 
leaves  open  the  possibility  of  a  covert  channel,  as  explained  in  Chapter  4. 

In  this  chapter,  we  are  concerned  with  estimating  the  capacity  of  network  covert 
channels.  Static  scheduling  policy  eliminates  covert  channels  due  to  variations  in 
traffic  characteristics;  adaptive  scheduling  policy,  however,  seeks  to  improve  resource 
utilization and decrease cost by reducing unnecessary padding and improve
responsiveness at the expense of allowing certain covert channels.

In our model, we pre-allocate transmission capacity to nodes in the network
to  eliminate  most  of  the  covert  channels  due  to  temporal  variations  in  transmission 
characteristics.  This  allocation  of  transmission  capacity  is  enforced  by  a  transmission 
schedule.  We  adopt  a  slotted  time  system  in  which  time  is  partitioned  into  a  sequence 
of  cycles  composed  of  periods.  Each  period  is  in  turn  composed  of  active  and  idle  slots. 
In  order  to  be  able  to  respond  to  changes  in  the  system  load,  the  nodes  periodically 
(typically  at  the  end  of  a  cycle)  renegotiate  the  number  of  idle  slots  in  the  period, 
thereby  changing  the  transmission  characteristics.  The  time  interval  during  which 
the  transmission  characteristics  remain  fixed  is  called  a  "state"  (or  "mode").  See 
figure 5.1. When the network operates in a mode, the numbers of active and idle slots
are fixed and therefore there can be no covert channel.

Renegotiation  of  transmission  characteristics  to  accommodate  changing  load 
causes  mode  changes  that  may  lead  to  covert  channels.  As  the  number  of  active 
slots  and  their  use  is  fixed  by  the  transmission  schedule  and  remains  unchanged  over 
successive  periods,  they  do  not  contribute  to  the  covert  channel  capacity.  The  change 
in  the  number  of  idle  slots  within  a  period  creates  a  potential  covert  channel;  the 
number  of  idle  slots  within  a  period  constitutes  a  symbol. 

In the following section, we use the information-theoretic approach due to
Shannon et al.[49] and its application by Millen[31] to compute the covert channel
capacity in the limit. Our analysis in section 5.3 is motivated by the mode secure
system model for covert channel capacity suppression proposed by Browne[7]. We
describe our model as a mode secure system and discuss covert channel capacity
estimation techniques. In chapter 6, we discuss the auditability of network covert
channels and define audit parameters based on our measurements on UFNET.

Figure 5.1. State Transition Diagram for A Mode Secure System

5.2     Capacity  Analysis  Using  Information  Theory

We  start  with  the  assumption  that  there  are  at  least  two  nodes  (in  a  network  of 
n  nodes)  at  different  security  levels  that  intend  to  communicate  via  a  covert  channel. 
Each period has n(n - 1) active slots to transmit actual, rerouted and dummy packets
and m idle slots during which the nodes remain idle. Let M - 1 be the maximum
number of idle slots allowed in any period. Therefore the number of idle slots,
$m \in [0 \cdots M-1]$.

The system is said to be in state i if the number of idle slots in the period is
i. The system remains in state i during the time interval when the number of idle
slots remains unchanged. When the number of idle slots in a period changes from i
to j, we say that the system has made a transition from state i to state j (or a mode
change).

Changes in the system load cause changes in transmission characteristics, achieved
by a negotiation process initiated by the sender of a covert message. From our
definition of a state and a state transition, it is clear that if the system remains in the
same state then there is no covert channel; it is the transition from one state to
another that may cause a potential covert channel to exist. The capacity of the channel
depends on the frequency of state transitions and the number of distinct states to
which the system can transition immediately after the current state.

To  obtain  the  maximum  information  rate,  we  assume  that  the  sender  is  capable 
of  significantly  affecting  the  system  load,  either  by  itself  or  in  collusion  with  other 
processes,  thereby  necessitating  a  renegotiation  of  the  transmission  characteristics. 
We,  however,  assume  that  the  channel  is  noiseless,  i.e.,  other  nodes  in  the  network 
do  not  attempt  to  change  the  system  load  or  use  the  covert  channel  when  the  sender 
and  receiver  processes  are  active.  To  realize  the  maximum  channel  capacity,  the  state 
transitions  must  occur  at  the  end  of  each  period  and  the  system  should  be  able  to 
make  a  transition  to  any  one  of  the  possible  states.  Maximum  attainable  capacity  is 
influenced by the maximum number of idle slots, M - 1, and is realized if the sender
is  capable  of  transmitting  one  symbol  each  period  by  modulating  the  number  of  idle 
slots  at  the  end  of  each  period. 

To  send  a  distinct  symbol  in  a  period: 

•  all  nodes  follow  the  transmission  schedule  during  active  slots;  the  sender  seeks 
to  change  the  transmission  characteristics  by  changing  the  effective  load  on  the 
system; 


•  the sender initiates the renegotiation process to change transmission
characteristics, and consequently to modulate m;

•  nodes agree to change the number of idle slots to m at the end of the period,
where $0 \le m < M$;

•  the  receiver  sees  m  idle  slots  at  the  end  of  the  period  and  records  the  symbol 
m. 

Therefore  the  symbol  that  is  transmitted  in  state  m,  is  encoded  by  the  number 
of  idle  slots  in  that  state.  Note  that  it  may  be  more  practical  to  change  the  number 
of  idle  slots  in  the  next  period  rather  than  in  the  current  period;  doing  so  does  not 
affect  the  channel  capacity. 

Note  also  that  the  transition  to  the  next  state  does  not  depend  on  the  current 
or  previous  states;  however,  symbols  are  of  varying  lengths  and  therefore  we  have 
varying  transition  durations.  No  explicit  synchronization  is  required  between  the 
sender  and  the  receiver;  the  nodes  use  the  implicit  synchronization  points  at  the 
period  or  cycle  boundaries.  The  slotted  time  system  helps  keep  the  sender  and  the 
receiver  synchronized  without  incurring  any  additional  synchronization  overhead. 
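
The symbol encoding described above can be sketched in a few lines. This is a hedged illustration only: the helper names `period_lengths` and `decode` are hypothetical, the channel is assumed noiseless as in the text, and the receiver is assumed to know n and to observe each period length exactly.

```python
def period_lengths(symbols, n):
    """Length in slots of each period when period k carries symbols[k] idle slots."""
    active = n * (n - 1)  # active slots are fixed by the schedule
    return [active + m for m in symbols]


def decode(observed_lengths, n):
    """Recover the symbols (idle-slot counts) from the observed period lengths."""
    active = n * (n - 1)
    return [length - active for length in observed_lengths]


symbols = [0, 2, 1, 3]          # a message over M = 4 symbols (m in 0..3)
lengths = period_lengths(symbols, n=3)
print(lengths)                  # symbols of different value take different time
print(decode(lengths, n=3))     # the receiver reads the message back
print(sum(lengths))             # total slots needed to send the message
```

The example makes the varying symbol durations concrete: a symbol m costs n(n - 1) + m slots, which is why the capacity analysis that follows must handle symbols of different lengths.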

5.2.1     Channel  Capacity:  General  Case 

From Shannon et al.[49], the capacity C of a noiseless discrete channel with
symbols of different length is

$$C = \lim_{t \to \infty} \frac{\log_2 N_j(t)}{t}$$

where $N_j(t)$ is the number of possible symbol sequences taking time t to transmit
beginning in state j.

In our model, as the symbol transmitted in the next period does not depend on
the current or previous state, we drop the subscript j. If N(t) represents the number
of sequences of duration t, we have,

$$N(t) = \sum_{i} N(t - t_i) = N(t - t_1) + N(t - t_2) + \cdots + N(t - t_n)$$

where $t_i$ is the time required to send symbol $S_i$.

Given a set of symbols $S_1, S_2, \ldots, S_n$ of duration $t_1, t_2, \ldots, t_n$ respectively,
the above relation can be justified as follows. $N(t - t_i)$ represents the number of
sequences starting with the symbol $S_i$ taking exactly time t. After transmission of
symbol $S_i$, the time remaining for the rest of the sequence, if any, is $(t - t_i)$.

Since $N(t - t_i)$ includes all message sequences taking time t and starting with
symbol $S_i$, $\sum_i N(t - t_i)$ includes all possible message sequences that can be
transmitted in time t. The total number is equal to the sum of the numbers of
sequences given by $N(t - t_1), N(t - t_2), \ldots, N(t - t_n)$. According to a well known
result in finite differences, N(t) is then asymptotic for large t to $A X_0^t$ where A is
constant and $X_0$ is the largest positive real solution of the characteristic equation

$$X^{-t_1} + X^{-t_2} + \cdots + X^{-t_n} = 1.$$


$$C = \lim_{t \to \infty} \frac{\log_2 N(t)}{t} = \lim_{t \to \infty} \frac{\log_2 (A X_0^t)}{t} = \lim_{t \to \infty} \frac{\log_2 A + t \log_2 X_0}{t} = \log_2 X_0$$

Each positive real solution for X yields an asymptotic value for an achievable
information rate; the channel capacity is calculated from the largest root.
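
As a brief worked illustration (the symbol durations are chosen here for convenience and are not taken from the model), consider a channel with just two symbols of durations $t_1 = 1$ and $t_2 = 2$. The count of sequences then follows the Fibonacci recurrence and the capacity can be written in closed form:

```latex
N(t) = N(t-1) + N(t-2), \qquad X^{-1} + X^{-2} = 1
\;\Longleftrightarrow\; X^{2} - X - 1 = 0,

X_0 = \frac{1 + \sqrt{5}}{2} \approx 1.618, \qquad
C = \log_2 X_0 \approx 0.694 \ \text{bits per unit time}.
```

The other root of the quadratic is negative, so it does not contribute an achievable rate; only the largest positive root determines C.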

In  the  following  section,  we  derive  general  bounds  for  channel  capacity  using 
Shannon's  results  from  information  theory.  In  deriving  the  channel  capacity,  our 
approach  is  similar  to  Millen[31]. 

5.2.2     Channel  Capacity:  General  Bounds 

To compute the maximum capacity, we must determine the number of distinct
sequences that can be transmitted during a specified time interval, given that the time
required to transmit a symbol $S_m$ is $t_m = n(n-1) + m$, where $t_m$ is normalized to
slot time.

Let us fix the maximum number of idle slots in a period to be M - 1. Let us
also indicate the number of idle slots in period i by $m_i$, where $0 \le m_i < M$. Then
let $N_M(t)$ represent the number of symbol sequences of duration t using M symbols
and let $C_M$ be the asymptotic capacity of the channel with maximum number of idle
slots being M - 1.
Therefore,

$$C_M = \lim_{t \to \infty} \frac{\log_2 N_M(t)}{t}$$

and

$$N_M(t) = \sum_{i=0}^{M-1} N_M(t - t_i) = N_M(t - t_0) + N_M(t - t_1) + N_M(t - t_2) + \cdots + N_M(t - t_{M-1}).$$

This is a difference equation with a characteristic equation of the form

$$1 = x^{-d} + x^{-(d+1)} + \cdots + x^{-(d+M-1)}$$

where $t_i = d + i$ and $d = n(n-1)$ for n nodes.

The logarithm of the largest positive root of the characteristic equation is the covert
channel capacity. This equation might have to be solved numerically since the
equation may be of high degree.
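
One way to carry out such a numerical solution is sketched below. It is not the procedure used in this work: it takes the characteristic equation in the form of Section 5.2.1, $\sum_i X^{-t_i} = 1$ with $t_i = d + i$, and finds $X_0$ by bisection, relying on the fact that the left side is strictly decreasing in X. The function names and the tolerance are assumptions made for this example.

```python
import math


def capacity_from_durations(durations, tol=1e-12):
    """Capacity log2(X0) where X0 solves sum(X ** -t for t in durations) == 1.

    durations: distinct positive integer symbol durations (in slots).
    """
    def g(x):
        # Strictly decreasing for x > 0, so g has at most one positive root.
        return sum(x ** (-t) for t in durations) - 1.0

    # g(1) = len(durations) - 1 >= 0, and g(2) < 0 because distinct positive
    # integer durations give a sum strictly below 1/2 + 1/4 + ... = 1.
    lo, hi = 1.0, 2.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if g(mid) > 0:
            lo = mid
        else:
            hi = mid
    return math.log2((lo + hi) / 2)


def covert_capacity(n, M):
    """Capacity in bits per slot for n nodes and idle-slot counts 0..M-1."""
    d = n * (n - 1)
    return capacity_from_durations([d + i for i in range(M)])


for n in (3, 4, 5):
    print(n, covert_capacity(n, M=4))
```

Under these assumptions the computed capacity shrinks as the number of nodes grows (longer periods per symbol) and grows with M (more distinguishable symbols), which matches the qualitative tradeoff discussed in the text.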

To derive the number of sequences that can be transmitted, we consider the
following two cases.

Case 1: $t \le M$

$N_M(t)$ is the number of symbol sequences that require exactly t time units to be
transmitted.

Claim 1:

$\forall M \ge 1$, if $t_i = 1 + i$ for $i \in [0 \cdots M-1]$, then $\forall t$, $0 < t \le M$, $N_M(t) = 2^{t-1}$

Discussion of Claim 1

From Section 5.2.1, we know that

$$N_M(t) = N_M(t-1) + N_M(t-2) + \cdots + N_M(t-(t-1)) + N_M(0)$$

We let $N_M(0) = 1$ by convention. This is because if $t = t_1 + t_2$, then
$N(t) \ge N(t_1) N(t_2)$. However, if $t = t_1$, then $N(t) = N(t) N(0)$. Therefore N(0) needs to be
1 for the identity to hold. Note that the empty string is the only sequence taking
time 0 to transmit, i.e., N(0) = 1. With this convention $N_M(1) = N_M(0) = 1$, and for
$1 < t \le M$ the sum gives $N_M(t) = N_M(t-1) + [N_M(t-2) + \cdots + N_M(0)] = 2 N_M(t-1)$,
so $N_M(t) = 2^{t-1}$.
Case 2: $t > M$

Let the number of symbols be M and $t_i = 1 + i$ for $i \in [0 \cdots M-1]$.

Claim 2:

$\forall M \ge 1$, if $t_i = 1 + i$ for $i \in [0 \cdots M-1]$, then $\forall k \ge 0$,

$$N_M(M+k+1) = 2 N_M(M+k) - N_M(k)$$

Proof of Claim 2

Base Case

From the difference equations, we know that

$$N_M(M+k) = N_M(M+k-1) + N_M(M+k-2) + \cdots + N_M(k)$$

For k = 0, the total number of sequences in time t = M + 1 is given by

$$\begin{aligned}
N_M(M+0+1) &= N_M(M) + N_M(M-1) + \cdots + N_M(2) + N_M(1) \\
&= N_M(M) + N_M(M-1) + \cdots + N_M(1) + N_M(0) - N_M(0) \\
&= N_M(M) + N_M(M) - N_M(0) \\
&= 2 N_M(M) - N_M(0)
\end{aligned}$$

Therefore the Claim is true for k = 0.

Induction Step

Suppose the claim, $N_M(M+k+1) = 2 N_M(M+k) - N_M(k)$, is true $\forall k \le K$.
Now for k = K + 1, we have,

$$\begin{aligned}
N_M(M+k+1) &= N_M(M+k+1-1) + N_M(M+k+1-2) + \cdots + N_M(k+1) \\
&= N_M(M+k) + N_M(M+k-1) + \cdots + N_M(k+1) \\
&= N_M(M+k) + N_M(M+k-1) + \cdots + N_M(k) - N_M(k) \\
&= N_M(M+k) + N_M(M+k) - N_M(k) \\
&= 2 N_M(M+k) - N_M(k)
\end{aligned}$$

Therefore Claim 2 is true $\forall k \ge 0$.

The recurrence relation, in its stated form, is suitable for computing the number
of message sequences that can be transmitted, $N_M(t)$, $t = M + k + 1$, $\forall k \ge 0$, by
maintaining a table of intermediate values of $N_M(t)$. The closed form solution for
the case when $t \le M$ will provide the required initial values for the generation of the
table.
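
The table-driven computation described above can be sketched as follows. This is only an illustrative sketch, not the implementation used in this work; the function names `count_sequences` and `count_sequences_direct` are hypothetical. The first function seeds the table with the closed form of Claim 1 and extends it with the recurrence of Claim 2; the second evaluates the defining sum directly and serves as a cross-check.

```python
def count_sequences(M, t_max):
    """Table of N_M(t) for t = 0..t_max, with symbol durations 1..M."""
    N = [0] * (t_max + 1)
    N[0] = 1  # the empty sequence, by convention
    for t in range(1, t_max + 1):
        if t <= M:
            N[t] = 2 ** (t - 1)                # Claim 1 (closed form)
        else:
            N[t] = 2 * N[t - 1] - N[t - M - 1]  # Claim 2 with k = t - M - 1
    return N


def count_sequences_direct(M, t_max):
    """Same table, from N_M(t) = sum over durations j of N_M(t - j)."""
    N = [0] * (t_max + 1)
    N[0] = 1
    for t in range(1, t_max + 1):
        N[t] = sum(N[t - j] for j in range(1, M + 1) if t - j >= 0)
    return N
```

For example, `count_sequences(3, 6)` fills the entries for $t = 1, 2, 3$ from the closed form and the entries for $t = 4, 5, 6$ from the recurrence.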

Proposition 5.1: $N_M(t) \le N_M(t')$ if $t < t'$

This proposition states that given a set of symbols, the number of symbol
sequences that can be composed does not decrease with increasing time duration, given that we
have a symbol that takes unit time to transmit, i.e., $\exists\, t_1$.

Proof of Proposition 5.1:

Part A: $t \le M$

From Case 1, we know that,

$N_M(t) = 2^{t-1}$ if $t \le M$ and

$N_M(t) < N_M(t+1)$, $\forall t < M$

Therefore $N_M(t) = 2^{t-1}$ and $N_M(t') = 2^{t'-1}$.

As $t < t'$, $2^{t-1} < 2^{t'-1}$, we have

$N_M(t) < N_M(t')$.

As $t$ increases, the number of symbol sequences increases and therefore Proposition
5.1 is true for Part A.

Part B: $t > M$

In Case 2 above, we proved that the relation

$N_M(M+k+1) = 2N_M(M+k) - N_M(k)$ if $t > M$ holds for all $k \ge 0$.

Now,

$$
\begin{aligned}
N_M(M+k+1) &= 2N_M(M+k) - N_M(k) \\
&= N_M(M+k) + N_M(M+k) - N_M(k)
\end{aligned}
$$

We can easily show that $N_M(M+k) - N_M(k) \ge 0$.

Therefore $N_M(M+k+1) \ge N_M(M+k)$.

From the discussion in Part A and B, Proposition 5.1 is true.

QED  Proposition  5.1 

Proposition 5.2: $N_M(t) \le N_{M'}(t)$ if $M < M'$

This proposition states that given the same time duration, the number of symbol
sequences that can be composed with $M$ symbols is less than or equal to the number
of symbol sequences that can be composed with $M'$ symbols, $M < M'$.

Proof  of  Proposition  5.2: 

We  know  that, 

$N_M(t) = \sum_{i=0}^{M-1} N_M(t - t_i) = N_M(t - t_0) + N_M(t - t_1) + \cdots + N_M(t - t_{M-1})$

and,

$N_{M'}(t) = \sum_{i=0}^{M'-1} N_{M'}(t - t_i) = N_{M'}(t - t_0) + N_{M'}(t - t_1) + \cdots + N_{M'}(t - t_{M'-1})$

Subtracting $N_M(t)$ from $N_{M'}(t)$, we get

$$
\begin{aligned}
N_{M'}(t) - N_M(t) = {} & \bigl(N_{M'}(t - t_0) + N_{M'}(t - t_1) + N_{M'}(t - t_2) + \cdots + N_{M'}(t - t_{M'-1})\bigr) \\
& - \bigl(N_M(t - t_0) + N_M(t - t_1) + N_M(t - t_2) + \cdots + N_M(t - t_{M-1})\bigr)
\end{aligned}
$$

Combining similar terms,

$$
\begin{aligned}
N_{M'}(t) - N_M(t) = {} & [N_{M'}(t - t_0) - N_M(t - t_0)] + [N_{M'}(t - t_1) - N_M(t - t_1)] + \cdots \\
& + [N_{M'}(t - t_{M-1}) - N_M(t - t_{M-1})] \\
& + N_{M'}(t - t_M) + N_{M'}(t - t_{M+1}) + \cdots + N_{M'}(t - t_{M'-1})
\end{aligned}
$$

If $\forall t' < t$, $N_{M'}(t') \ge N_M(t')$, then the first $M$ bracketed terms are greater than or equal to zero, so that

$N_{M'}(t) - N_M(t) \ge N_{M'}(t - t_M) + N_{M'}(t - t_{M+1}) + \cdots + N_{M'}(t - t_{M'-1})$

Since $N_{M'}(t) \ge 0$ if $t \ge 0$, the last $M' - M$ terms are also greater than or equal to zero.

Therefore $N_{M'}(t) - N_M(t)$ is non-negative and we have the desired result $N_M(t) \le N_{M'}(t)$.

QED  Proposition  5.2 

Corollary  5.1: 

$N_M(t) \le N_{M'}(t')$, $\forall M, t, M', t'$, where $1 \le M \le M'$ and $0 < t \le t'$

This  corollary  claims  that  by  increasing  the  number  of  symbols  available  and  by 
increasing  the  duration  available  for  the  transmission  of  allowed  symbol  sequences, 
the  total  number  of  symbol  sequences  that  can  be  transmitted  increases. 

Using  Proposition  5.1  and  5.2,  the  corollary  can  be  easily  proved. 
The  significance  of  this  corollary  in  our  model  is  that  as  we  increase  the  number  of 
idle  slots  in  a  period,  we  can  encode  more  symbol  sequences  and  therefore  see  an 
increased  covert  channel  capacity. 
QED  corollary  5.1 
Corollary  5.2: 
$C_{M,t} \le C_{M',t}$, if $M' > M$ and $t > 0$

The corollary claims that, in general, the capacity of a channel increases if a
larger set of symbols is available for transmission, assuming that there is sufficient
time for the transmission of those symbols.
Proof of corollary 5.2:
We know that $C_{M,t} = \frac{\log_2 N_M(t)}{t}$.
From Proposition 5.2, we know that the number of symbol sequences,
$N_M(t) \le N_{M'}(t)$ if $M < M'$, so
$C_{M,t} = \frac{\log_2 N_M(t)}{t} \le \frac{\log_2 N_{M'}(t)}{t} = C_{M',t}$
QED corollary 5.2
Corollary  5.3: 
$C_{M,t} \le C_{M,t'}$, if $t' > t$

This corollary claims that, in general, the capacity of a channel increases with
increasing time available for transmission of a given set of symbols.
Proof of corollary 5.3:
We know that $C_{M,t} = \frac{\log_2 N_M(t)}{t}$.
We can restate corollary 5.2 as
$C_{M,t} \le C_{t,t}$, $\forall t > 0$ and $M \le t$.
This gives us,

$C_{t,t} = \frac{\log_2 N_t(t)}{t} = \frac{\log_2(2^{t-1})}{t} = \frac{t-1}{t}$

$t' > t$ implies $\frac{t-1}{t} < \frac{t'-1}{t'}$,

giving us $C_{t,t} < C_{t',t'}$ for $t < t'$.

From corollary 5.1, we can get $C_{M,t} \le C_{M',t'}$ $\forall M \le M'$ and $t \le t'$. Therefore we
have $C_{M,t'} \le C_{t',t'}$.

Therefore $C_{M,t} \le C_{t,t} \le C_{M,t'} \le C_{t',t'}$, if $t < t'$

QED  corollary  5.3 
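
The finite-time capacity $C_{M,t} = \log_2 N_M(t)/t$ used in the corollaries above can be evaluated numerically for small cases. The sketch below is illustrative only; the helper names `sequence_count` and `finite_capacity` are hypothetical, and symbol durations are assumed to be $1, 2, \ldots, M$ slot times as in this section.

```python
import math


def sequence_count(M, t):
    """N_M(t): number of symbol sequences lasting exactly t, durations 1..M."""
    N = [0] * (t + 1)
    N[0] = 1  # empty sequence
    for s in range(1, t + 1):
        N[s] = sum(N[s - j] for j in range(1, M + 1) if s - j >= 0)
    return N[t]


def finite_capacity(M, t):
    """C_{M,t} = log2(N_M(t)) / t, in bits per unit time."""
    return math.log2(sequence_count(M, t)) / t
```

With $M = t$ the count is $2^{t-1}$, so `finite_capacity(t, t)` evaluates to $(t-1)/t$, the quantity that appears in the proof of corollary 5.3.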

Theorem  5.1:  Limiting  Capacity  of  the  Channel 

The  limiting  capacity  of  the  channel  is 

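$C_\infty = \lim_{M \to \infty}\left(\lim_{t \to \infty} C_{M,t}\right) = 1$

This theorem states that as the number of symbols available and the time duration
under consideration approach infinity, the channel capacity asymptotically
approaches the maximum channel capacity.
Proof of Theorem 5.1:

$C_\infty = C_{M,t} = \lim_{M \to \infty}\left(\lim_{t \to \infty} C_{M,t}\right) = \lim_{t \to \infty}\left(\lim_{M \to \infty} C_{M,t}\right)$

We can interchange the limits on the above equation as both the series converge when
$t \to \infty$ and $M \to \infty$.

$$
\begin{aligned}
C_\infty = C_{M,t} &= \lim_{t \to \infty}\left(\lim_{M \to \infty} C_{M,t}\right) \\
&= \lim_{t \to \infty}\left(\lim_{M \to \infty} \frac{\log_2 N_M(t)}{t}\right) \\
&= \lim_{t \to \infty}\left(\lim_{M \to \infty} \frac{\log_2 2^{t-1}}{t}\right) \\
&= \lim_{t \to \infty}\left(\lim_{M \to \infty} \frac{t-1}{t}\right) \\
&= \lim_{t \to \infty} \frac{t-1}{t} \\
&= 1
\end{aligned}
$$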

QED  Theorem  5.1 

Therefore, we conclude that if we have an infinite number of symbols and infinite
time to send the message sequences, the channel capacity can attain its theoretical
maximum at $C_\infty$. However, if either the number of symbols or the time available is
finite, then the channel capacity will be close to but will never attain the theoretical
maximum capacity. From a practical standpoint, as the coding technique becomes
more ideal, the delays incurred in the process of coding become longer. Therefore
there is a tradeoff between the gain in transmission time due to efficient coding and the
coding delay.

In this section, we assumed that the number of idle slots in a period, $m_i \in
[0 \cdots M-1]$, and derived simple relations to compute the number of symbol sequences
that can be transmitted in the allowed time $t$. However, as our model specifies
$d = n(n-1)$ active slots per period, we consider that case in the following section.

5.2.3     Channel Capacity: $m \in [0 \cdots M-1]$

In our model, each period consists of $d$ active slots followed by a maximum of
$M - 1$ idle slots (see figure 4.1). Therefore a time interval of $t_d$ elapses between
two successive symbols that can be transmitted via the covert channel (time $t_d$ is
normalized to slot times). In other words, the active slots have the effect of "delaying"
two successive symbols by a time interval $t_d$. Therefore, the covert channel will suffer
a capacity degradation due to the presence of active slots in each period.
Let the number of idle slots in period $i$ be $m_i$, where $0 \le m_i < M$.
Again, $C = \lim_{t \to \infty} (\log_2 N_M(t))/t$
and

$$
\begin{aligned}
N_M(t) &= \sum_{i=0}^{M-1} N_M(t - t_{d+i}) \\
&= N_M(t - t_d) + N_M(t - t_{d+1}) + N_M(t - t_{d+2}) + \cdots + N_M(t - t_{d+M-1})
\end{aligned}
$$

This is a difference equation with a characteristic equation of the form

$1 = x^{-d} + x^{-(d+1)} + x^{-(d+2)} + \cdots + x^{-(d+M-1)}$.

The logarithm of the largest positive root of the characteristic equation is the covert
channel capacity. This equation has to be solved numerically since the equation
usually is of high degree.

We restate the above problem of determining $N_M(t)$ as a problem of finding
the partitions of a positive integer. The number of possible message sequences that
can be transmitted is equal to the total number of permutations of each partition
(permutation of a multiset) of a positive integer that represents the time allowed for
the transmission of allowed symbol sequences.

A partition of a positive integer $M$ is a multiset of positive numbers that add
up to $M$. We can specify the multiset by specifying the multiplicity of each element
in it. A list of nonnegative numbers $(j_1, j_2, \cdots, j_M)$ such that $\sum_{i=1}^{M} i \, j_i = M$ tells us
the multiplicity of each integer $i$ in the multiset. Note that $M$ is not the number of
elements in the multiset; it is the sum of the elements of the multiset.

There is no known simple closed-form formula for $P(M)$, the total number of partitions
of $M$. We find that $P(M)$ can be studied fruitfully by means of the technique of
generating functions[5][45]. Such functions are well documented in the literature and
we will not discuss them in this thesis; we refer the interested reader to Pólya[45] and
Bogart[5].
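
The counting problem restated above can be made concrete with a small table-filling routine. The sketch below is only an illustration under the assumptions of this section: each symbol occupies $d$ active slots plus between $0$ and $M - 1$ idle slots, so the allowed parts are $d, d+1, \ldots, d+M-1$. It counts ordered arrangements of parts (permutations of partitions), not the partitions themselves. The function name `count_ordered_sequences` is hypothetical.

```python
def count_ordered_sequences(d, M, t):
    """Number of ordered sequences of parts from {d, ..., d+M-1} summing to t."""
    parts = range(d, d + M)
    N = [0] * (t + 1)
    N[0] = 1  # the empty sequence
    for s in range(1, t + 1):
        # The last symbol sent has some duration p; the rest fills s - p.
        N[s] = sum(N[s - p] for p in parts if p <= s)
    return N[t]
```

For instance, with parts $\{2, 3\}$ and $t = 7$ the only partition is $2 + 2 + 3$, which has three distinct orderings, so the routine returns 3.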
General  Comments 

Our original problem of finding the number of sequences $N_M(t)$, given that
$t_i = d + m_i$, where $m_i \in [0 \cdots M-1]$, is equivalent to finding the sum of unique
permutations of the partitions of the integer $t$, given that the parts are from the set
$[d, d+1, d+2, \ldots, d+M-1]$. By reducing our problem to finding permutations of $P(M)$,
we have a procedure to determine $N_M(t)$. The very reduction also suggests that no
closed form solution exists. However, the generating functions can be easily derived
and the coefficients computed by writing simple routines to generate the tables of
intermediate values. Simple routines are presented in Nijenhuis[42].

Figure 5.2. Covert Channel Capacity (covert channel capacity for d = [10..200], M = [10, 50, 100])

In figure 5.2, we estimate the covert channel capacity by finding the largest real
root of the characteristic equation. The maximum number of idle slots in a period, $M$,
is fixed at 10, 50 or 100 slots. For each of these, the number of active slots in a period
varies in the range $d \in [10 \cdots 200]$, increasing by 10 active slots at each iteration. As
can be seen from figure 5.2, the covert channel capacity decreases with decreasing
number of idle slots, $M$. Also, for the same number of idle slots, the covert channel
capacity decreases with increasing number of active slots. The capacity degradation
is because the active slots have the effect of "delaying" two successive symbols by a
time interval $t_d$.

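A numerical estimate of this kind can be sketched with a simple bisection on the characteristic equation $1 = x^{-d} + x^{-(d+1)} + \cdots + x^{-(d+M-1)}$. This is a minimal sketch, not the routine used to produce figure 5.2; the function name `covert_capacity` and the iteration count are assumptions. It relies on the fact that, for $d \ge 1$, the right-hand side is decreasing in $x$, is at least 1 at $x = 1$ and is below 1 at $x = 2$, so the single positive root lies in $[1, 2)$.

```python
import math


def covert_capacity(d, M, iterations=200):
    """log2 of the positive root of 1 = sum_{i=0}^{M-1} x**-(d+i), d >= 1.

    The result is in bits per slot time.
    """
    def f(x):
        return sum(x ** -(d + i) for i in range(M)) - 1.0

    lo, hi = 1.0, 2.0  # f(lo) >= 0 and f(hi) < 0 for d >= 1
    for _ in range(iterations):
        mid = (lo + hi) / 2.0
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
    return math.log2((lo + hi) / 2.0)
```

A sweep such as `[covert_capacity(d, M) for d in range(10, 201, 10)]` for each of `M = 10, 50, 100` covers the parameter ranges described above.
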
In an actual communication subsystem, a similar analysis will help the system
designers to choose the ratio of active to idle slots such that the covert channel
capacity is within acceptable bounds. It can also help the system designers determine
the length of a cycle and the frequency with which renegotiation of transmission
characteristics can be permitted to keep the channel capacity within predetermined
levels.

5.3     Capacity  Analysis  Using  Mode  Based  Security 

In mode secure systems, resources such as disk space, CPU time, etc., are
pre-allocated to users at the different security levels of a multilevel secure system [7].
Such  pre-partitioning  of  resources  eliminates  covert  channels,  but  reduces  resource 
utilization  and  reliability.  To  improve  utilization  and  reliability,  a  dynamic  resource 
allocation  strategy  is  adopted.  Resources  are  reallocated  periodically,  taking  the 
system  from  one  "mode"  to  another.  As  long  as  the  system  operates  in  a  mode,  there 
is  no  possibility  of  covert  channels  since  there  is  no  resource  sharing  among  different 
security  levels.  However,  covert  channels  may  exist  when  the  system  undergoes  a 
mode  change.  Browne  suggests  techniques  to  reduce  the  capacity  of  such  covert 
channels  [7]. 

In  mode  based  secure  systems,  the  capacity  of  a  covert  channel  depends  on  two 
factors:  first,  the  frequency  with  which  the  system  changes  its  mode  and  second, 
the  number  of  possible  different  modes  to  which  the  system  can  transition  during  a 
mode change. We can reduce the covert channel bandwidth by either limiting the
frequency of mode change or limiting the number of legal transitions at each mode
or both.

In our model, a cycle is defined as the time interval between changes in
transmission characteristics and can be set either to a fixed amount of time or a fixed
number of periods. If both the time and the number of periods are variable, the sender,
in attempting to maximize the channel capacity, will try to force the minimum
of the two.

If the cycle consists of a fixed number of periods, then we can estimate the
channel capacity using the information theory based technique, the difference being
that the symbols take longer to transmit. If $t_i$ was the time required to
transmit the symbol assuming one period per cycle, we will now require $P t_i$ time to
transmit the same symbol, where $P$ is the number of periods per cycle. This implies
that the channel capacity will be reduced by a factor of $P$.

Let us consider the case when the cycle length is fixed and does not depend on
the number of periods. Let $T_c$ be the length of a cycle and let $T_c \gg T_p$, where $T_p$ is
the length of a period. $M - 1$ is the maximum number of idle slots allowable in any
period and $m \in [0 \cdots M-1]$. Time is normalized to slot times.
Let $t_m$ denote the time required to transmit a symbol $m$. Then $t_m \approx T_c$, which
implies $t_m \approx t_{m'}$, for all symbols $m, m'$. The channel capacity $C$ is simply $\frac{\log_2 M}{T_c}$.

5.4     Conclusion 

In  this  chapter,  we  presented  a  formal  and  an  informal  method  to  estimate  covert 
channel  capacity.  The  formal  method  was  based  on  Millen's  application  of  Shannon's 
information theory to estimate covert channel capacity. The informal method was
based  on  Browne's  mode  security  systems.  We  compute  the  network  covert  channel 
capacity  by  determining  the  number  of  message  sequences  that  can  be  transmitted. 
Using  the  adaptive  scheduling  policy,  a  (covert  message)  symbol  is  encoded  by  the 
number  of  idle  slots  in  a  period.  We  derive  simple  formulations  to  estimate  the 
number  of  symbol  sequences  that  can  be  transmitted  when  the  maximum  number  of 
idle  slots  is  M  and  suggest  a  technique  to  estimate  the  number  of  symbol  sequences 
when  the  number  of  active  slots  in  a  period  is  d. 

In the next chapter, we discuss the auditability of covert channels. In
section 6.3, we estimate the covert channel capacity for noiseless channels; in section 6.4,
we  repeat  the  exercise  for  noisy  channels.  In  section  6.5,  we  discuss  various  handling 
policies  to  reduce  the  covert  channel  capacity  and  in  section  6.6,  we  discuss  the  factors 
affecting  the  capacity  and  auditability  of  network  covert  channels. 


CHAPTER  6 
AUDITABILITY  OF  COVERT  CHANNELS 


6.1     Introduction 

An effective method of handling known covert channels is to deter their potential
users. The existence of the covert channel is known to users, who may attempt
to exploit the channel, but the deterrence mechanism discourages such channel use.
Covert  channel  auditing  is  the  main  deterrence  mechanism  and  is  effective  only  when 
the  covert  channel  use  can  be  detected  unambiguously,  i.e.,  discovery  of  covert  channel 
use  must  be  certain.  Covert  channel  auditing  must  not  be  circumventable,  and  false 
detection  of  covert  channel  use  must  be  avoided[50]. 

6.2     Auditable  Covert  Channels 

The  goal  of  audit  analysis  is  to  detect  potential  covert  channel  usage  by  detecting 
unusual  usage  of  resources.  The  difficulty  in  auditing  covert  channels  is  distinguishing 
covert  channel  use  from  "normal"  operation  of  the  system. 

Our  measurements  on  ECSNET  (Engineering  Consulting  Services  Network,  a 
subnet  on  UFNET)  showed  that  on  a  typical  day,  19,888,635  packets  were  exchanged 
over  a  15  hour  period,  the  mean  packet  size  being  291  bytes.  Since  there  are  25 
nodes  in  the  network,  the  mean  number  of  packets  transmitted  per  node  per  minute 
is 884 packets. We also observed the maximum load on the network to be 35,050
packets per minute or 1,402 packets per node per minute and the minimum load to
be 2,680 packets per minute or 107 packets per node per minute. We will use these
traffic  measurements  to  determine  the  covert  channel  capacity  and  propose  audit 
mechanisms. 

Figure 6.1 shows the distribution of the percentage change in traffic volume over
one minute intervals. The distribution is fit very well by a normal distribution, also
shown in the figure. We accept approximately 95% of the variations in traffic
characteristics to be "normal" and exclude them from scrutiny. Two standard deviations
for the distribution shown in figure 6.1 correspond to a 23.95 percent change in traffic volume.
Therefore we set the threshold for auditability, $\theta$, to be 24%, i.e., any variation in
the traffic volume that is at least 24% of the current volume is audited. Based on the
above measurements, a 5% audit factor will generate 45 events over a 15 hour period
for the network. With 25 nodes in the network and assuming 100 bytes per audit
record, the audit log can grow as large as 100k bytes just for the ECSNET subnetwork.
If measurements are done at a smaller granularity, or if the system load varies
considerably over the measurement duration, then the audit logs can grow much larger.
In such cases, a 5% audit factor might be too high from a practical standpoint.
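
The arithmetic behind these audit estimates can be reproduced with a short sketch. It is illustrative only; the function name `audit_summary` is hypothetical, and it assumes that every audited one-minute interval produces one audit record per node, which is one reading of the figures quoted above rather than a stated rule.

```python
def audit_summary(total_packets, hours, nodes, audit_fraction, record_bytes):
    """Mean per-node load, audited intervals and audit log size."""
    minutes = hours * 60
    mean_per_node_per_minute = total_packets / minutes / nodes
    events = audit_fraction * minutes           # audited one-minute intervals
    log_bytes = events * nodes * record_bytes   # assumed: one record per node
    return mean_per_node_per_minute, events, log_bytes


mean_load, events, log_bytes = audit_summary(19_888_635, 15, 25, 0.05, 100)
```

Under these assumptions the log size comes to roughly 112,500 bytes, the same order as the 100k bytes quoted above.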

Figure 6.1. Percentage Change In Volume: ECSNET Traffic (change in traffic volume over 1 minute intervals; traffic monitored over a 15 hour period, shown against a normal distribution; horizontal axis: % change in traffic volume)

For our model, we must estimate traffic in terms of slot times based on our
traffic observations and performance evaluation of the spatial neutrality model.
Under the given environment and load conditions, our approach has an overhead of a
factor of four to achieve spatial and temporal neutrality (see performance results in
section 7.3). Hence unaudited channel capacities will be based on loads four times
the observed load. ECSNET is a 10 Mbps LAN; with average packet size being 291
bytes, we derive the slot transmission time as 0.23 msec. With sufficient guard band
and allowance for other overhead included, we take the effective slot time to be 0.5
msec. Since the number of nodes in the network, $n$, is 25, our model prescribes a
minimum of $n(n-1) = 600$ active slots in a period. At peak load conditions, the
period consists entirely of active slots with a period length of 0.3 seconds or 200
periods per minute. At this rate, the maximum sustainable packet transmission rate is
4800 packets per node per minute. A detailed discussion is presented in section 6.3.3.

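The slot and period figures quoted above follow from a few lines of arithmetic, sketched here for illustration. The function name `slot_budget` and its parameter names are hypothetical; the input values are the ones stated in the text.

```python
def slot_budget(link_bps, packet_bytes, effective_slot_s, nodes):
    """Raw slot time, active slots, minimum period and peak per-node rate."""
    raw_slot_s = packet_bytes * 8 / link_bps          # time to send one packet
    active_slots = nodes * (nodes - 1)                # n(n - 1) per period
    min_period_s = active_slots * effective_slot_s    # period with no idle slots
    max_periods_per_min = 60 / min_period_s
    # Each node sends one packet to every other node in each period.
    max_packets_per_node_per_min = max_periods_per_min * (nodes - 1)
    return (raw_slot_s, active_slots, min_period_s,
            max_periods_per_min, max_packets_per_node_per_min)


budget = slot_budget(10e6, 291, 0.0005, 25)
```

The raw slot time evaluates to about 0.23 msec, and the remaining entries give 600 active slots, a 0.3 second minimum period, 200 periods per minute and 4800 packets per node per minute.
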
In section 6.3, we estimate the capacity of auditable covert channels assuming
that the channel is noiseless and suggest simple but effective handling techniques.
This is followed by a similar analysis for noisy channels in section 6.4. Note that this
is an informal estimate of channel capacity in which we assume fixed cycle length. A
formal analysis of channel capacity will require an information theory based approach,
similar to the capacity analysis in section 5.2, with the time required to make a
transition, $t_{ij}$, depending on the current state $i$ and the number of cycles needed
to transition to state $j$. The set of allowed symbols and the number of possible
transitions will depend on factors such as the current system load, the slot size, etc.
Such an analysis is very involved even for simple cases and we will not pursue it in
this thesis. The informal technique to compute the channel capacities of auditable
covert channels with and without handling is shown in Figure 6.2.

6.3     Auditability of Noiseless Covert Channels

In this section, we discuss the auditing of covert channels for minimum, average
and maximum traffic load cases. We assume that the channel is noiseless, i.e., any
changes to the traffic load are caused only by the sender and that the sender has
the maximum effect over changing the traffic volume over successive periods. Other
nodes in the network contribute a constant volume of traffic. To derive the maximum
possible covert channel capacity, the sender is assumed to operate at its maximum
transmission capacity.

We also assume that there are 25 nodes in the network, the slot time is 0.5 msec
and the cycle time, $T_c$, is one minute. In order to compute the channel capacity for a
given traffic load, we determine the maximum and minimum number of idle slots in
a period for a -24% and a +24% change in the traffic volume respectively. For each
audited channel, we also derive the channel capacity with and without handling. The
covert channel handling methods reduce channel capacities by reducing the number
of states to which the system can transition. This reduction in the number of states
is achieved by increasing the granularity of change in the number of idle slots in a
period. We discuss cases where the granularity is fixed at ±10, ±50, ±100 and when
the granularity is a fraction of the total number of slots in the period. This gives
us the range of the number of idle slots within a period and knowing the handling
policy, we can determine the number of states that are possible. Once we know the
number of states, we can easily determine the covert channel capacity.

Procedure Audit();
Input:
    l_base, base system load (packets/node/minute)
    n, number of nodes in the network
    sl, slot time in seconds
    θ, the audit threshold as percent change in traffic
    δ, granularity of change in the number of idle slots
Output:
    Channel Capacity, bps

l_base = 4 × l_base;        /* a factor of four overhead to achieve neutrality */
period_length = l_base/(n - 1);
        /* each node exchanges a packet with every other node */
compute period_length in seconds per period;
slots_per_period = period_length/sl;
idle_slots_base = slots_per_period - active_slots
        /* n(n - 1) active slots per period */
if channel is noiseless, then
    l_(+θ) = l_base + l_base * θ;
    compute idle_slot_(+θ);
    l_(-θ) = l_base - l_base * θ;
    compute idle_slot_(-θ);
endif
if channel is noisy, then
    compute idle_slot_(+θ)
    compute idle_slot_(-θ)
endif
states_max = idle_slot_(-θ) - idle_slot_(+θ);
Channel Capacity, C_max = log2(states_max) / T_c  bps
if δ is fixed, then
    states_δ = ⌈states_max / δ⌉
    Channel Capacity, C_δ = log2(states_δ) / T_c  bps
endif
if δ is fraction of idle_slots_base, then δ = α × slots_per_period
    states_δ = ⌈(idle_slots_base - idle_slot_(+θ)) / δ⌉ + ⌈(idle_slot_(-θ) - idle_slots_base) / δ⌉
    Channel Capacity, C_δ = log2(states_δ) / T_c  bps
endif

Figure 6.2. Procedure to Compute Auditable Covert Channel Capacities

6.3.1     Auditing  Noiseless  Covert  Channels  for  Average  Load 

The number of actual packets transmitted on an average is 884 packets per node
per minute; using our approach the number of packets exchanged is 884 × 4 = 3536
packets per node per minute. Since each node exchanges one packet with every other
node in the network each period, we have 3536/24 = 147.3 periods per minute or
0.407 seconds per period. Assuming slot time to be 0.0005 seconds per slot, we have
0.407/0.0005 = 814 slots per period. Since the number of active slots per period is
$n(n-1) = 600$, the number of idle slots per period is 214 for this load level.

Since we accept any variation less than ±24% as normal, we now estimate the
capacity for a variation of ±24%. A 24% change in traffic volume yields 3536 ± 849
packets per node per minute. Following the computations shown above, 3536 + 849 =
4385 packets per node per minute. This yields 4385/24 = 182.71 periods per minute
or 0.3284 seconds per period for 0.3284/0.0005 = 657 slots per period with 57 idle
slots.

Similarly, 3536 - 849 = 2687 packets per node per minute. This yields 2687/24 =
111.958 periods per minute or 0.5359 seconds per period for 0.5359/0.0005 = 1072
slots per period with 472 idle slots.

Therefore the range of idle slots is 472 - 57 + 1 = 416 states. This can be
encoded by $\lceil \log_2 416 \rceil = 9$ bits. Therefore the capacity of this channel is 9/60 = 0.15
bps.

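The computation above can be sketched in a few lines. This is an illustrative sketch of the informal procedure, not a definitive implementation; the names `idle_slots` and `audited_capacity` are hypothetical, the slot count is rounded to the nearest slot, and the endpoints of the idle-slot range are counted inclusively as in the worked numbers.

```python
import math


def idle_slots(load, nodes=25, slot_s=0.0005):
    """Idle slots per period for a load in packets per node per minute."""
    periods_per_minute = load / (nodes - 1)
    slots = round(60.0 / periods_per_minute / slot_s)
    return slots - nodes * (nodes - 1)  # subtract the n(n - 1) active slots


def audited_capacity(base_load, theta=0.24, overhead=4, cycle_s=60):
    """Idle-slot range, number of states and capacity in bps."""
    load = overhead * base_load
    fewest = idle_slots(load * (1 + theta))  # +theta: busier, fewer idle slots
    most = idle_slots(load * (1 - theta))    # -theta: lighter, more idle slots
    states = most - fewest + 1
    bits = math.ceil(math.log2(states))
    return fewest, most, states, bits / cycle_s
```

Calling `audited_capacity(884)` reproduces the average-load case: 57 and 472 idle slots, 416 states, and 9 bits per one-minute cycle.
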
Handling  Policy 

1.  If we assume that the granularity of change in the number of idle slots, $\delta = \pm 100$
from the previous period, then there are 5 states, needing 3 bits to encode the
states. This yields a channel capacity of 3/60 = 0.05 bps.

2.  If we assume that the granularity of change in the number of idle slots, $\delta = \pm 50$
from the previous period, then there are 9 states, needing 4 bits to encode the
states. This yields a channel capacity of 4/60 = 0.07 bps.

3.  If we assume that the granularity of change in the number of idle slots, $\delta = \pm 10$
from the previous period, then there are 42 states, needing 6 bits to encode the
states. This yields a channel capacity of 6/60 = 0.1 bps.

4.  If we assume that the granularity of change in the number of idle slots, $\delta$, is
a fraction, $\alpha$, of the basic number of slots in the period, then we have $\delta =
\alpha \times$ slots.

For example, if $\alpha = 0.1$, then $\delta = 0.1 \times 814 = 81$. To determine the total number
of states, we compute the number of states for ±24% variation in traffic. When
the variation in traffic is +24%, the number of idle slots is 57. Therefore the
number of states in the positive direction is $\lceil \frac{214 - 57}{81} \rceil = 2$. Similarly, when the
variation in traffic is -24%, the number of idle slots is 472. Therefore the
number of states in the negative direction is $\lceil \frac{472 - 214}{81} \rceil = 4$. Therefore the total
number of states is 2 + 4 + 1 = 7. This can be encoded by $\lceil \log_2 7 \rceil = 3$ bits, giving
a channel capacity of 0.05 bps.
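
The state counts in the four handling policies above can be checked with a short sketch. It is illustrative only; the function names are hypothetical, the fixed-granularity count is taken as the ceiling of the unhandled state count divided by $\delta$, and the proportional count adds one for the base state, following the worked numbers.

```python
import math


def states_fixed(total_states, delta):
    """States left when idle slots may only change in steps of delta."""
    return math.ceil(total_states / delta)


def states_proportional(idle_base, idle_min, idle_max, slots_per_period, alpha):
    """States when delta is a fraction alpha of the slots in a period."""
    delta = int(alpha * slots_per_period)
    up = math.ceil((idle_base - idle_min) / delta)    # +24% direction
    down = math.ceil((idle_max - idle_base) / delta)  # -24% direction
    return up + down + 1                              # plus the base state


def capacity_bps(states, cycle_s=60):
    """Whole bits needed to encode the states, per cycle."""
    return math.ceil(math.log2(states)) / cycle_s
```

For the average load, `states_fixed(416, 100)`, `states_fixed(416, 50)` and `states_fixed(416, 10)` give the state counts of policies 1 to 3, and `states_proportional(214, 57, 472, 814, 0.1)` gives that of policy 4.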

6.3.2     Auditing  Noiseless  Covert  Channels  for  Minimum  Load 

The minimum load was observed to be 2680 packets per minute or 107 packets
per node per minute. Using our approach, there would be 4 × 107 = 428 packets per
node per minute. This gives 428/24 = 17.83 periods per minute or 3.364 seconds per
period. With a slot time of 0.0005 seconds, we have 6728 slots per period. With 600
active slots per period, the number of idle slots per period is 6128.

A 24% change in traffic volume yields 428 ± 103 packets per node per minute.
Following the computations shown above, 428 + 103 = 531 packets per node per
minute. This yields 531/24 = 22.13 periods per minute or 2.711 seconds per period
for 2.711/0.0005 = 5422 slots per period, with 4822 idle slots.

Similarly, 428 - 103 = 325 packets per node per minute. This yields 325/24 =
13.54 periods per minute or 4.43 seconds per period for 4.43/0.0005 = 8860 slots per
period with 8260 idle slots.

Therefore the range of idle slots is 8260 - 4822 + 1 = 3439. This can be encoded
by $\lceil \log_2 3439 \rceil = 12$ bits. Therefore the capacity of this channel is 12/60 = 0.2 bps.

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Handling  Policy 

1. If we assume that the granularity of change in the number of idle slots is δ = ±100
from the previous period, then there are 35 states, needing 6 bits to encode the
states. This yields a channel capacity of 6/60 = 0.1 bps.

2. If we assume that the granularity of change in the number of idle slots is δ = ±50
from the previous period, then there are 69 states, needing 7 bits to encode the
states. This yields a channel capacity of 7/60 = 0.12 bps.

3. If we assume that the granularity of change in the number of idle slots is δ = ±10
from the previous period, then there are 344 states, needing 9 bits to encode
the states. This yields a channel capacity of 9/60 = 0.15 bps.

4. For the proportional method with α = 0.1, δ = 0.1 × 6728 = 672. When
the variation in traffic is +24%, the number of idle slots is 4822. Therefore the
number of states in the positive direction is ⌈(6128 - 4822)/672⌉ = 2. Similarly, when
the variation in traffic is -24%, the number of idle slots is 8260. Therefore
the number of states in the negative direction is ⌈(8260 - 6128)/672⌉ = 4. Counting
the unchanged state, the total number of states is 2 + 4 + 1 = 7. This can be encoded
by ⌈log2 7⌉ = 3 bits, giving a channel capacity of 3/60 = 0.05 bps.
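The arithmetic of this subsection can be checked mechanically. The following Python fragment is a minimal sketch rather than part of the original analysis: the function names are ours, and it assumes 24 packets per node per period (one to each of the other n - 1 = 24 nodes), 600 active slots, a slot time of 0.0005 seconds, a one-minute cycle, and truncation of fractional slots. Because the text rounds the period length before dividing by the slot time, individual idle-slot counts may differ from the text by a slot; the range, the state counts and the bit counts for the minimum load agree.

```python
import math

SLOT_TIME = 0.0005        # seconds per slot
PACKETS_PER_PERIOD = 24   # assumed: one packet to each of the other n - 1 nodes
ACTIVE_SLOTS = 600        # n(n - 1) for n = 25
CYCLE = 60                # cycle length Tc in seconds


def total_slots(load):
    """Slots per period for a load in packets per node per minute."""
    periods_per_minute = load / PACKETS_PER_PERIOD
    return int(CYCLE / periods_per_minute / SLOT_TIME)  # drop partial slots


def idle_slots(load):
    return total_slots(load) - ACTIVE_SLOTS


def bits_for(states):
    """Bits needed to encode the given number of states."""
    return math.ceil(math.log2(states))


def unaudited(load, theta=0.24):
    """Range of idle slots and bits per cycle for a +/- theta load swing."""
    most_idle = idle_slots(round(load * (1 - theta)))
    least_idle = idle_slots(round(load * (1 + theta)))
    idle_range = most_idle - least_idle + 1
    return idle_range, bits_for(idle_range)


def fixed_handling_bits(idle_range, delta):
    """Bits per cycle when idle slots may only change in steps of delta."""
    return bits_for(math.ceil(idle_range / delta))


def proportional_handling_bits(load, alpha=0.1, theta=0.24):
    """Bits per cycle when delta is a fraction alpha of the basic slot count."""
    delta = int(alpha * total_slots(load))
    base = idle_slots(load)
    up = math.ceil((base - idle_slots(round(load * (1 + theta)))) / delta)
    down = math.ceil((idle_slots(round(load * (1 - theta))) - base) / delta)
    return bits_for(up + down + 1)  # +1 for the unchanged state


idle_range, bits = unaudited(428)
print(idle_range, bits, round(bits / CYCLE, 2))                      # 3439 12 0.2
print([fixed_handling_bits(idle_range, d) for d in (100, 50, 10)])   # [6, 7, 9]
print(proportional_handling_bits(428))                               # 3
```

Dividing each bit count by the 60-second cycle gives the capacities quoted above (0.2, 0.1, 0.12, 0.15 and 0.05 bps).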

6.3.3     Auditing  Noiseless  Covert  Channels  for  Maximum  Load 

Now consider the case of maximum network traffic. The total number of packets
exchanged is 1402 packets per node per minute. Using our approach, there would
be 4 × 1402 = 5608 packets per node per minute. This gives us 5608/24 = 233.66
periods per minute or 0.257 seconds per period. With a slot time of 0.0005 seconds
per slot, we have 513 slots per period.

A period with 513 slots is smaller than the minimum period length allowed by
our model. The model specifies that there are at least n(n - 1) = 600 active slots in
a period, implying a minimum period length of 600 × 0.0005 = 0.3 seconds and hence
a maximum of 200 periods per minute. At this rate, the system can support at most
200 × 24 = 4800 packets per node per minute. Therefore 5608 - 4800 = 808
packets per node per minute are backlogged by the transmission schedule for later
transmission.
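The bound on the sustainable load follows directly from the slot time and the minimum period length. As a small illustrative sketch (the function names are ours, and the slot time is expressed in microseconds only to keep the arithmetic in integers), the computation can be written as:

```python
SLOT_US = 500            # slot time of 0.0005 seconds, in microseconds
MINUTE_US = 60_000_000   # one minute, in microseconds


def max_sustainable_load(n):
    """Largest load (packets per node per minute) when every slot is active."""
    min_period_slots = n * (n - 1)            # a period with no idle slots
    slots_per_minute = MINUTE_US // SLOT_US   # 120000 slots in one minute
    max_periods = slots_per_minute // min_period_slots
    return max_periods * (n - 1)              # n - 1 packets per node per period


def backlog(offered_load, n):
    """Packets per node per minute that must wait for a later period."""
    return max(0, offered_load - max_sustainable_load(n))


print(max_sustainable_load(25))   # 4800
print(backlog(4 * 1402, 25))      # 808
```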

To  avoid  boundary  effects,  we  will  use  4800  -  24%  =  3648  packets  per  node  per 
minute  as  the  effective  maximum  load.  For  this  load,  we  have  3648/24  =  152  periods 
per  minute  or  0.39  seconds  per  period  for  790  slots  per  period  with  190  idle  slots  per 
period. 

A  24%  change  in  traffic  volume  yields  3648  ±  876  packets  per  node  per  minute. 
Following  the  computations  shown  above,  3648  +  876  =  4524  packets  per  node  per 
minute.  This  yields  4524/24  =  189  periods  per  minute  or  0.32  seconds  per  period  for 
0.32/0.0005  =  640  slots  per  period,  with  40  idle  slots. 

Similarly, 3648 - 876 = 2772 packets per node per minute. This yields 2772/24 =
115.5 periods per minute or 0.52 seconds per period for 0.52/0.0005 = 1040 slots per
period with 440 idle slots.

Therefore the range of idle slots is 440 - 40 + 1 = 401. This can be encoded by
⌈log2 401⌉ = 9 bits. Therefore the capacity of this channel is 9/60 = 0.15 bps.

Handling Policy

1. If we assume that the granularity of change in the number of idle slots is δ = ±100
from the previous period, then there are 5 states, needing 3 bits to encode the
states. This yields a channel capacity of 3/60 = 0.05 bps.

2. If we assume that the granularity of change in the number of idle slots is δ = ±50
from the previous period, then there are 9 states, needing 4 bits to encode the
states. This yields a channel capacity of 4/60 = 0.07 bps.

3. If we assume that the granularity of change in the number of idle slots is δ = ±10
from the previous period, then there are 41 states, needing 6 bits to encode the
states. This yields a channel capacity of 6/60 = 0.1 bps.

4. For the proportional method with α = 0.1, δ = 0.1 × 790 = 79. When the
variation in traffic is +24%, the number of idle slots is 40. Therefore the
number of states in the positive direction is ⌈(190 - 40)/79⌉ = 2. Similarly, when the
variation in traffic is -24%, the number of idle slots is 440. Therefore the
number of states in the negative direction is ⌈(440 - 190)/79⌉ = 4. Counting the
unchanged state, the total number of states is 2 + 4 + 1 = 7. This can be encoded
by ⌈log2 7⌉ = 3 bits, giving a channel capacity of 3/60 = 0.05 bps.

Discussion  of  Noiseless  Channel  Capacity 

Table 6.1 shows the covert channel capacity for a noiseless channel with and
without handling. We see that the period length is inversely proportional to the
network load. As the load increases, the number of idle slots per period reduces,
thereby reducing period length. A shorter period yields a greater fraction of active
slots and therefore greater utilization of transmission capacity. Since the covert
channel capacity depends on the maximum number of idle slots per period, it is clear
that with increasing load, i.e., with increasing effective utilization, the covert channel
capacity reduces.

Table 6.1. Channel Capacities for a Noiseless Channel

                                   Minimum Load   Average Load   Maximum Load
Basic Load
  load, pkt/node/min                        428           3536           3648
  # slots                                  6728            814            790
  # idle slots                             6128            214            190
Unaudited Channels
  -24% load                                 325           2687           2772
  +24% load                                 531           4385           4524
  # idle slots for -24% load               8260            472            440
  # idle slots for +24% load               4822             57             40
  Range of idle slots                      3439            416            401
  Channel Capacity, bps                    0.20           0.15           0.15
Channel Handling
  idle ±100, Cap. bps                       0.1           0.05           0.05
  idle ±50, Cap. bps                       0.12           0.06           0.07
  idle ±10, Cap. bps                       0.15            0.1            0.1
  δ = 0.1 × # slots                         672             81             79
  # states in ±24% load                       7              7              7
  δ = 0.1 × # slots, Cap. bps              0.05           0.05           0.05

The handling policies reduce the number of states to which the system can transition
by varying the granularity of change in number of idle slots. With coarser
granularity (larger δ), the number of possible states decreases, reducing covert
channel capacity.

6.4     Auditability of Noisy Covert Channels

In a noisy channel, all nodes in the network affect the system load: any change
to the system load is the aggregate of changes to individual node traffic. If the
sender node wishes to vary the system load, the change is distributed over every node
in the network, which reduces the effective variation in the traffic volume and thus
introduces noise. We model this by adding the increased load from the sender node to
the total traffic, so that a 24% variation at the sender appears as a 24%/25 variation
at each of the 25 nodes. We assume that the traffic due to the other nodes remains
unchanged and that other parameters, such as the number of nodes in the network and
the slot time, remain the same as in the analysis of noiseless covert channels.
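To make the effect of this averaging concrete, the following Python sketch compares the number of bits per cycle when the ±24% swing is applied to the per-node load directly (noiseless) and when it is spread over 25 nodes (noisy). It is an illustration only: the names are ours, the constants are those of the noiseless analysis, and fractional slots are truncated. Since the text rounds intermediate period lengths, the computed ranges differ from the worked values by a few slots, but the bit counts agree.

```python
import math

SLOT_TIME = 0.0005        # seconds per slot
PACKETS_PER_PERIOD = 24   # assumed: n - 1 packets per node per period
ACTIVE_SLOTS = 600        # n(n - 1) for n = 25
CYCLE = 60                # cycle length in seconds
NODES = 25


def idle_slots(load):
    """Idle slots per period for a load in packets per node per minute."""
    return int(CYCLE / (load / PACKETS_PER_PERIOD) / SLOT_TIME) - ACTIVE_SLOTS


def capacity_bits(load, theta=0.24, spread=1):
    """Bits per cycle when a theta swing is spread over `spread` nodes."""
    swing = round(theta * load / spread)
    idle_range = idle_slots(load - swing) - idle_slots(load + swing) + 1
    return math.ceil(math.log2(idle_range))


for load in (428, 3536, 3648):
    # load, noiseless bits, noisy bits
    print(load, capacity_bits(load), capacity_bits(load, spread=NODES))
# 428 12 7
# 3536 9 5
# 3648 9 5
```

Dividing by the 60-second cycle gives 0.2, 0.15 and 0.15 bps for the noiseless channel against 0.12, 0.08 and 0.08 bps for the noisy channel.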

6.4.1     Auditing  Noisy  Covert  Channels  for  Average  Load 

From  our  analysis  of  the  average  load  case  for  noiseless  channels,  we  know  that 
the  basic  load  on  the  network  is  3536  packets  per  node  per  minute  and  the  number 
of  idle  slots  per  period  is  214  for  this  load  level. 

A 24% change in traffic volume results in a 24%/25 change in each node in the
network. Following the computations shown above, this yields 3536 + 849/25 = 3570
packets per node per minute for 3570/24 = 148.75 periods per minute or 0.40 seconds
per period. Assuming slot time to be 0.0005 seconds per slot, we have 0.40/0.0005 =
800 slots per period with 200 idle slots.

Similarly, 3536 - 849/25 = 3502 packets per node per minute. This yields
3502/24 = 145.92 periods per minute or 0.41 seconds per period for 0.41/0.0005 = 820
slots per period with 220 idle slots.

Therefore the range of idle slots is 220 - 200 + 1 = 21. This can be encoded by
⌈log2 21⌉ = 5 bits. Therefore the capacity of this channel is 5/60 = 0.08 bps.

Handling Policy

1. If we assume that the cycle length is one minute and the granularity of change
in the number of idle slots is δ = ±50 or δ = ±100, then the number
of states is restricted to 2, needing 1 bit to encode the states. Therefore the
channel capacity is 1/60 = 0.02 bps.

2. If we assume that the granularity of change in the number of idle slots is δ = ±10
from the previous period, then there are 3 states, needing 2 bits to encode the
states. This yields a channel capacity of 2/60 = 0.03 bps.

3. For the proportional method with α = 0.1, δ = 0.1 × 814 = 81. When the
variation in traffic is +24%, the number of idle slots is 200. Therefore the
number of states in the positive direction is ⌈(214 - 200)/81⌉ = 1. Similarly, when
the variation in traffic is -24%, the number of idle slots is 220. Therefore the
number of states in the negative direction is ⌈(220 - 214)/81⌉ = 1. Counting the
unchanged state, the total number of states is 3. This can be encoded by
⌈log2 3⌉ = 2 bits, giving a channel capacity of 2/60 = 0.03 bps.

6.4.2     Auditing Noisy Covert Channels for Minimum Load

From  our  analysis  of  the  minimum  load  case  for  noiseless  channels,  we  know  that 
the  basic  load  on  the  network  is  428  packets  per  node  per  minute  and  the  number  of 
idle  slots  per  period  is  6128. 

A 24% change in traffic volume results in a 24%/25 change in each node in the
network. Following the computations shown above, 428 + 103/25 = 432 packets per
node per minute. This yields 432/24 = 18.0 periods per minute or 3.33 seconds per
period for 3.33/0.0005 = 6667 slots per period with 6067 idle slots.

Similarly, 428 - 103/25 = 424 packets per node per minute. This yields 424/24 =
17.67 periods per minute or 3.396 seconds per period. Assuming slot time to be 0.0005
seconds per slot, we have 3.396/0.0005 = 6792 slots per period with 6192 idle slots.

Therefore the range of idle slots is 6192 - 6067 + 1 = 126. This can be encoded
by ⌈log2 126⌉ = 7 bits. Therefore the capacity of this channel is 7/60 = 0.12 bps.

Handling Policy

1. If we assume that the cycle length is one minute and the granularity of change
in the number of idle slots is δ = ±100 from the previous period, then there are
2 states, needing 1 bit to encode the states. This yields a channel capacity of
1/60 = 0.02 bps.

2. If we assume that the granularity of change in the number of idle slots is δ = ±50
from the previous period, then there are 3 states, needing 2 bits to encode the
states. This yields a channel capacity of 2/60 = 0.03 bps.

3. If we assume that the granularity of change in the number of idle slots is δ = ±10
from the previous period, then there are 13 states, needing 4 bits to encode the
states. This yields a channel capacity of 4/60 = 0.07 bps.

4. For the proportional method with α = 0.1, δ = 0.1 × 6728 = 673. When the
variation in traffic is +24%, the number of idle slots is 6067. Therefore the number
of states in the positive direction is ⌈(6128 - 6067)/673⌉ = 1. Similarly, when the
variation in traffic is -24%, the number of idle slots is 6192. Therefore the number
of states in the negative direction is ⌈(6192 - 6128)/673⌉ = 1. Counting the
unchanged state, the total number of states is 3. This can be encoded by
⌈log2 3⌉ = 2 bits, giving a channel capacity of 2/60 = 0.03 bps.

6.4.3     Auditing  Noisy  Covert  Channels  for  Maximum  Load 

From  our  analysis  of  the  maximum  load  case  for  noiseless  channels,  we  know  that 
the  basic  load  on  the  network  is  5608  packets  per  node  per  minute  or  5608/24  =  233.66 
periods  per  minute  or  0.257  seconds  per  period.  With  a  slot  time  of  0.0005  seconds 
per  slot,  we  have  514  slots  per  period. 

For the reasons discussed in section 5.6.3, we will use 4800 - 24% = 3648
packets per node per minute as the effective maximum load. From our analysis
of the maximum load case for noiseless channels, we have 3648/24 = 152 periods
per minute or 0.39 seconds per period or 790 slots per period with 190 idle slots per
period.

A 24% change in traffic volume results in a 24%/25 change in each node in the
network. Following the computations shown above, 3648 + 876/25 = 3684 packets
per node per minute. This yields 3684/24 = 153.5 periods per minute or 0.39 seconds
per period for 0.39/0.0005 = 780 slots per period with 180 idle slots.

Similarly, 3648 - 876/25 = 3612 packets per node per minute. This yields
3612/24 = 150.5 periods per minute or 0.398 seconds per period. Assuming slot time
to be 0.0005 seconds per slot, we have 0.398/0.0005 = 796 slots per period with 196
idle slots.

Therefore the range of idle slots in a period is 196 - 180 + 1 = 17. This can
be encoded by ⌈log2 17⌉ = 5 bits. Therefore the capacity of this channel is
5/60 = 0.08 bps.

Handling Policy

1. If we assume that the cycle length is one minute and the granularity of change
in the number of idle slots is δ = ±50 or δ = ±100 from the previous period,
then the number of states is restricted to 2, needing 1 bit to encode
the state. This yields a channel capacity of 1/60 = 0.02 bps.

2. If we assume that the granularity of change in the number of idle slots is δ = ±10
from the previous period, then there are 2 states, needing 1 bit to encode the
state. This yields a channel capacity of 1/60 = 0.02 bps.

3. For the proportional method with α = 0.1, δ = 0.1 × 790 = 79. When the
variation in traffic is +24%, the number of idle slots is 180. Therefore the
number of states in the positive direction is ⌈(190 - 180)/79⌉ = 1. Similarly, when
the variation in traffic is -24%, the number of idle slots is 196. Therefore the
number of states in the negative direction is ⌈(196 - 190)/79⌉ = 1.

Therefore the total number of states is 3. This can be encoded by ⌈log2 3⌉ = 2
bits, giving a channel capacity of 0.03 bps.

Table 6.2. Channel Capacities for a Noisy Channel

                                   Minimum Load   Average Load   Maximum Load
Basic Load
  load, pkt/node/min                        428           3536           3648
  # slots                                  6728            814            790
  # idle slots                             6128            214            190
Unaudited Channels
  -24%/25 load                              424           3502           3612
  +24%/25 load                              432           3570           3684
  # idle slots for -24%/25 load            6192            220            196
  # idle slots for +24%/25 load            6067            200            180
  Range of idle slots                       126             21             17
  Channel Capacity, bps                    0.12           0.08           0.08
Channel Handling
  idle ±100, Cap. bps                      0.02           0.02           0.02
  idle ±50, Cap. bps                       0.03           0.02           0.02
  idle ±10, Cap. bps                       0.07           0.03           0.02
  δ = 0.1 × # slots                         673             81             79
  # states in ±24% load                       3              3              3
  δ = 0.1 × # slots, Cap. bps              0.03           0.03           0.03

Discussion  of  Noisy  Channel  Capacity 

Table 6.2 shows the channel capacity for a noisy channel. A comparison with
Table 6.1 shows that the channel capacity for a noisy channel is less than the channel
capacity of noiseless channels for corresponding traffic loads. This is because the
24% change in traffic volume is distributed to every node in the network. Therefore
the effective change in the traffic volume, and consequently the range of variation in
the number of idle slots, is reduced, leading to reduced channel capacity. As in the
noiseless channel case, the handling policies reduce the number of states to which the
system can transition and therefore reduce the maximum covert channel capacity.

Figure 6.3. Noiseless Covert Channel for Different θ (plot title: Auditable Channel
Capacity with and without Handling, Granularity = 0.1 × # slots)

6.5     Handling Policies

Figure 6.3 shows the channel capacities for different auditability thresholds, θ.
The top three curves represent the channel capacities for a noiseless channel without
handling for minimum load, average and effective maximum load, and the actual
maximum load as indicated. The channel capacities for the average and the effective
maximum load are almost the same because the load in both cases is almost the
same (see Table 6.1). From the figure, we can see that as the auditability threshold
increases, the variability in the system load that is accepted as "normal" increases,
leading to higher covert channel capacities. Channel capacity also depends on the
total load on the system. In a lightly loaded system, each period contains a large
number of idle slots and therefore the potential covert channel has high capacity. On
the other hand, under maximum load conditions, there are very few idle slots, if any,
in each period. Therefore the channel capacity is lower. The 2σ, 3σ and 4σ values of
covert channel capacity are marked; as expected, by accepting a larger variation in
the traffic change as normal, we allow a potential covert channel of larger capacity.

Figure 6.4. Noiseless Covert Channel Capacities for Different Load

The lower three curves represent the channel capacities for a noiseless channel
with handling for minimum, average and maximum loads. Under the proportional
handling policy, the granularity of change in the number of idle slots, δ, is a
fraction, α, of the basic number of slots in the period, so that δ = α × slots.
In this case, we choose α = 0.1. We see that this handling technique reduces covert
channel capacity by more than 50 percent compared to the corresponding channel
with no handling. Note also that the channel capacity after handling in the average
and minimum load conditions is almost the same. This is due to the quantization
effects caused by the granularity of change in the number of idle slots.

Figure 6.4 shows the effect of different handling policies at different load
conditions on the covert channel capacity for a noiseless channel. From the figure, we
can see that as the load increases, the covert channel capacity reduces. We also note
that as the granularity of change in the number of idle slots becomes coarser, the
channel capacity decreases. The solid curve represents the case when no handling
mechanism is used. We see that proportional handling is more effective than the
simpler fixed-granularity handling policies. The covert channel capacity when α = 0.2
is constant at 0.03 bps and does not depend on the system load.
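The load independence of the proportional policy can be reproduced from the values in Table 6.1. The sketch below is illustrative only: the dictionary simply transcribes the table's slot counts, the function name is ours, and δ is truncated to a whole number of slots as in the worked examples.

```python
import math

CYCLE = 60  # cycle length in seconds

# (total slots, idle slots, idle slots at -24% load, idle slots at +24% load),
# transcribed from Table 6.1
TABLE_6_1 = {
    "minimum": (6728, 6128, 8260, 4822),
    "average": (814, 214, 472, 57),
    "maximum": (790, 190, 440, 40),
}


def proportional_capacity(slots, idle, idle_low_load, idle_high_load, alpha):
    """States and capacity (bps) when delta = alpha x slots."""
    delta = math.floor(alpha * slots + 1e-9)   # whole slots; guard float error
    up = math.ceil((idle - idle_high_load) / delta)
    down = math.ceil((idle_low_load - idle) / delta)
    states = up + down + 1                     # +1 for the unchanged state
    bits = math.ceil(math.log2(states))
    return states, round(bits / CYCLE, 2)


for alpha in (0.1, 0.2):
    results = {name: proportional_capacity(*row, alpha)
               for name, row in TABLE_6_1.items()}
    print(alpha, results)
# alpha = 0.1: 7 states and 0.05 bps at every load
# alpha = 0.2: 4 states and 0.03 bps at every load
```

Because δ scales with the period length, the number of reachable states, and hence the capacity, stays the same across the three load levels.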

6.6     Factors Affecting Channel Capacity and Auditability

The factors affecting the capacity and auditability of covert channels are the
system load, the cycle length, the granularity of change in the number of idle slots
and the auditability threshold. We will briefly discuss the effects of each of these
factors below.
System load
Channel Capacity ∝ 1/System load.

As the system load increases, the number of packets exchanged increases. This
has the effect of reducing the number of idle slots per period. The number of active
slots per period is n(n - 1) to guarantee spatial neutrality. A lower bound on the
number of slots per period, and consequently on the period length, is imposed when
the period consists of only active slots and zero idle slots. In this case, the
capacity of the system is completely assigned and any excess load is backlogged for
future transmission.

Assuming n = 25 nodes in the network, i.e., n(n - 1) = 600, Tslot = 0.0005
seconds and Tc = 1 minute, the maximum sustainable load as derived in section 5.6.3
is

$$\frac{60 \times 24}{0.0005 \times 600} = 4800 \text{ packets per node per minute.}$$

As  seen  in  Figure  6.4,  as  the  system  load  increases,  the  covert  channel  capacity 
decreases.  The  effect  of  handling  policies  on  covert  channel  capacity  is  as  discussed 
in  the  previous  section. 
Cycle length, Tc
Capacity ∝ 1/Tc

The channel capacity depends on the number of states to which the system
can transition, which is the range of idle slots that a period can contain. This
range is computed by finding the number of idle slots per period for the maximum
and minimum variation in load. The covert channel capacity is given by
log2(range)/Tc. Therefore increasing the cycle length by some factor will reduce
the channel capacity by the same factor.

While  increasing  cycle  length  reduces  the  channel  capacity,  it  gives  the  sender 
more  time  to  effect  the  changes  to  system  load,  i.e.,  the  sender  can  smooth  out  the 
variation  over  a  longer  period.  This  implies  that  the  interval  over  which  the  variability 
of  traffic  was  considered  to  be  "normal"  will  be  extended,  making  it  more  difficult  to 
audit  the  channel. 

Granularity of change in number of idle slots, δ
Capacity ∝ 1/δ

Figure 6.5. Noiseless Covert Channel Capacities for Different Delta (plot title:
Auditable Channel Capacity with Different Delta)

Given a range of idle slots that a period can contain, the number of states
depends on the granularity of change, δ. The coarser the granularity, the smaller
the number of possible states, and hence the lower the channel capacity.

Figure  6.5  shows  the  effect  of  granularity  on  covert  channel  capacity.  The  solid 
curve  shows  the  channel  capacity  under  maximum  load  conditions  and  the  dashed 
curve  shows  the  capacity  under  minimum  and  average  load  conditions.  Due  to  the 
handling  policies,  the  channel  capacities  at  minimum  and  average  load  conditions  are 
almost  the  same. 

Coarser granularity means that the sender must cause a large enough
change in the system load for the number of idle slots to change. This improves the
auditability of the channel. Finer granularity allows the sender to change the load
by a very small fraction of the current load and still manage to transmit a symbol.

However, coarser granularity reduces the system's responsiveness. The system
will remain in the current state unless the system load has significantly changed. This
might lead to poor quality of service and under-utilization of system resources.
Auditability threshold, θ
Capacity ∝ θ

The auditability threshold, θ, determines the variation in system load that will be
accepted as "normal"; any variation above this threshold is a candidate for audit
analysis. This threshold is determined by studying the traffic characteristics and is
set to a value such that most of the variations that occur during the course of normal
system operation are excluded from scrutiny.

The threshold should be set so that no potential covert channel usage goes
undetected and false detection of covert channel use is avoided. A larger than optimal
threshold will allow potential usage of the covert channel to go undetected. A smaller
than optimal threshold will trigger too many spurious audit events, using up expensive
resources and reducing confidence in the audit mechanism by auditing innocuous
events. Figure 6.3 shows the effect of different auditability thresholds on covert
channel capacity.
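The role of the threshold can be illustrated with a small filter over observed loads. This is a sketch under our own assumptions, not the audit mechanism itself: it measures variation relative to the previous cycle's load, and the sample loads are hypothetical.

```python
def audit_candidates(loads, theta=0.24):
    """Indices of cycles whose load differs from the previous cycle's load
    by more than the fraction theta."""
    flagged = []
    for i in range(1, len(loads)):
        previous, current = loads[i - 1], loads[i]
        if previous > 0 and abs(current - previous) / previous > theta:
            flagged.append(i)
    return flagged


# hypothetical loads, in packets per node per minute
samples = [3536, 3600, 4500, 4400, 3200, 3300]
print(audit_candidates(samples))         # [2, 4]
print(audit_candidates(samples, 0.30))   # []
```

With θ = 0.24 the two large swings are selected for audit; raising the threshold to 0.30 accepts both as normal, which is the trade-off described above.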

6.6.1     Limiting  Channel  Capacity  to  Desired  Levels 

Having discussed the effects of various parameters on covert channel capacity,
in this section we give a procedure to determine the values of the parameters for a
desired channel capacity. We note that while parameters such as the number of nodes
in the network, system load and slot time are fixed by the underlying network, the
network administrator can decide parameters such as the auditability threshold, θ,
and the granularity of change in the number of idle slots, δ.

The auditability threshold primarily depends on the variability of the system
load and the degree of "completeness" desired of the auditability analysis. The
granularity of change in the number of idle slots depends on the handling policy and
on the covert channel capacity that we are willing to accept in exchange for system
responsiveness.

For a given number of nodes in the network, n, and slot time, Ts, the number of
idle slots in a period for any particular load is given by

Number of idle slots = Total number of slots - Number of active slots,

$$\text{idle} = \frac{T_c (n-1)}{T_s \times \text{load}} - n(n-1),$$

where load is in packets per node per cycle (per minute in our examples).

Since the covert channel capacity depends on the number of states, we have to
find the range of the number of idle slots in a period. Since the auditability threshold
is θ, the variation in load that we need to consider is load ± θ × load.

Range of idle slots = (idle slots at load - θ × load) - (idle slots at load + θ × load) + 1

$$\text{range} = \left[\frac{T_c (n-1)}{T_s (\text{load} - \theta \times \text{load})} - n(n-1)\right] - \left[\frac{T_c (n-1)}{T_s (\text{load} + \theta \times \text{load})} - n(n-1)\right] + 1,$$

where the number of active slots, n(n - 1), is deducted.

Simple algebraic manipulation gives us

$$\text{range} = a\left(\frac{1}{1-\theta} - \frac{1}{1+\theta}\right) + 1,$$

where $a = \frac{T_c (n-1)}{T_s \times \text{load}}$ is the number of slots per period at the basic load.

Therefore we have

$$\text{range} = \frac{2a\theta}{1-\theta^2} + 1.$$

From this relation, we know the maximum covert channel capacity without
handling. In handling the covert channels, we attempt to reduce its potential capacity
by reducing the range of states possible. This is done by varying the granularity of
change in the number of idle slots either by a constant amount or by making the
change proportional to the system load.

Once we know the range of the idle slots in a period, we can derive the covert
channel capacity, C, as

$$C = \frac{\log_2(\text{range}/\delta)}{T_c} \text{ bps},$$

so that

$$\log_2(\text{range}) - \log_2 \delta = T_c C, \quad\text{or}\quad \log_2 \delta = \log_2(\text{range}) - T_c C,$$

where δ is the granularity of change in the number of idle slots.

If we are given C, the desired covert channel capacity, we can find the granularity
of the change in the number of idle slots as

$$\log_2 \delta \ge \log_2(\text{range}) - T_c C, \quad\text{or}\quad \delta \ge 2^{\log_2(\text{range}) - T_c C} = \frac{\text{range}}{2^{T_c C}}.$$
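As a hedged illustration of this procedure, the following sketch computes the smallest whole-slot granularity δ satisfying δ ≥ range / 2^(Tc × C) for a target capacity C. The function names are ours, and the sketch uses the continuous relation above rather than the rounded state and bit counts of the worked examples.

```python
import math


def required_granularity(idle_range, target_capacity, cycle=60):
    """Smallest integer delta with log2(idle_range / delta) / cycle <= target."""
    return math.ceil(idle_range / 2 ** (cycle * target_capacity))


def handled_capacity(idle_range, delta, cycle=60):
    """Capacity in bps for a given range of idle slots and granularity."""
    return math.log2(idle_range / delta) / cycle


# Minimum-load range of 3439 idle slots, target capacity 0.05 bps
delta = required_granularity(3439, 0.05)
print(delta)                                   # 430
print(handled_capacity(3439, delta) <= 0.05)   # True
```

For the minimum-load range of 3439 idle slots and a one-minute cycle, a target of 0.05 bps requires δ of at least 430 slots; one slot less already exceeds the target.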

If the handling policy is fixed, then the granularity, δ, is as computed. If we
use the proportional handling policy, then δ = α × slots, where slots is the total
number of slots in a period. Using the above relation, we can determine the granularity
of the change in the number of idle slots such that the covert channel capacity remains
constant regardless of the changes to the system load.

6.7     Spatial Covert Channels

In this section, we estimate the covert channel capacity due to spatial variation
in the traffic matrix. Given a traffic matrix, the covert channel capacity depends
on the number of distinct traffic matrices that can be constructed before spatial
neutrality is achieved. Therefore every intermediate traffic matrix itself constitutes
a symbol. The covert channel capacity is derived as

$$C = \frac{\log_2(\text{No. of distinct traffic matrices})}{T},$$

where T is the observation time.

To derive an upper bound on the spatial covert channel capacity without any
handling, we have to determine the effect that a node can have on the structure of
the traffic matrix. In our case, we select the node with the largest volume of outgoing
communication. Since this node can transmit each of its k packets to one of n - 1
destinations, there are (n - 1)^k possible combinations, giving us a channel capacity
of the order of k log(n - 1) bits for the observation interval. Therefore the channel
capacity is C = k log(n - 1)/T, where T is the observation time. If the traffic matrix
has just one packet to transmit and the observation interval is set to the time required
for the transmission of one packet, i.e., if T = 1/(rate), then we get the upper bound
for the spatial covert channel as C < log(n). This is an upper bound because we neglect
the order of transmission. For a more accurate measure of the spatial covert channel,
we have to eliminate the effects of the order of transmission on the traffic
matrices.

Figure 6.6. Spatial Covert Channel Capacity, Volume 0-100
Figure 6.6 shows the channel capacity for a uniformly distributed traffic volume
in the range [0, ..., 100] packets. As seen in the figure, the covert channel capacity
increases with the dimension of the traffic matrix.
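The growth of the upper bound with the dimension of the traffic matrix can be seen from a direct evaluation of k log2(n - 1)/T. The sketch below is illustrative only; the packet count k = 100 and the observation time T = 60 seconds are hypothetical values chosen for the example, and the function name is ours.

```python
import math


def spatial_capacity_bound(n, k, observation_time):
    """Upper bound k * log2(n - 1) / T, in bits per second."""
    return k * math.log2(n - 1) / observation_time


# hypothetical setting: 100 packets observed over 60 seconds
for n in (5, 9, 17, 33):
    print(n, round(spatial_capacity_bound(n, k=100, observation_time=60), 2))
```

Each doubling of n - 1 adds k/T bits per second to the bound, so the capacity grows logarithmically with the number of nodes.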

To determine the spatial covert channel capacity without the effect of ordering,
we again state the problem as one of finding the partitions of a positive
integer. Given the total volume of communication in the traffic matrix as an
integer M, each partition of M can be interpreted as a distinct traffic matrix
with the total volume of communication being M, but with a different internode
communication pattern. Since we are interested in the total number of such
traffic matrices, without taking the order of transmission into consideration,
we are interested in finding the number of partitions of M, given by P(M).
Therefore the capacity of the spatial covert channel will be
$C = \frac{\log P(M)}{T}$. As discussed in section 5.2.3, there is no closed-form
solution for P(M), but routines to compute the number of partitions are
presented in Nijenhuis [42].
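
The partition count P(M) can be computed with a standard dynamic program. The sketch below is not the routine from Nijenhuis [42]; it is a generic textbook method, and the function names and the base-2 logarithm are assumptions made for illustration.

```python
import math


def partition_count(M: int) -> int:
    """Number of partitions P(M) of a non-negative integer M (order ignored)."""
    if M < 0:
        raise ValueError("M must be non-negative")
    # ways[s] = number of partitions of s using only the parts tried so far.
    ways = [1] + [0] * M
    for part in range(1, M + 1):
        for s in range(part, M + 1):
            ways[s] += ways[s - part]
    return ways[M]


def unordered_spatial_capacity(M: int, T: float) -> float:
    """Evaluate log2(P(M)) / T for total volume M and observation time T > 0."""
    return math.log2(partition_count(M)) / T


if __name__ == "__main__":
    print(partition_count(10))   # 42
    print(partition_count(100))  # 190569292
```

The table is filled one part size at a time, so each partition is counted once regardless of the order of its parts, which is the property the ordering-free capacity estimate needs.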
6.8     Summary of Auditability Analysis

As  discussed  above,  network  covert  channels  can  be  effectively  contained  using 
relatively  simple  audit  mechanisms.  Our  analysis  of  the  traffic  on  ECSNET  showed 
that  the  normal  variation  in  the  traffic  is  ±24%  of  the  current  load  on  the  system. 
Therefore  we  set  the  audit  threshold  at  24%. 

Covert  channel  capacity  is  estimated  for  minimum,  average  and  maximum  load 
conditions with and without handling policies. We observe that the handling
policies reduce the channel capacity to TCSEC acceptable levels. We have also
estimated  the  channel  capacity  for  various  auditability  thresholds,  6,  granularity  of 
change  in  the  number  of  idle  slots,  6,  and  the  covert  channel  capacity  for  various  load 
conditions  with  and  without  any  handling  mechanism  for  noiseless  covert  channels. 

While  it  is  difficult  to  audit  noisy  channels,  audit  mechanisms  that  employ 
knowledge  of  recent  network  behavior  by  maintaining  a  history  of  resource  utilization 
for  each  node  and  the  network  as  a  whole  can  better  audit  noisy  covert  channels.  Any 
change  in  a  particular  node's  traffic  is  compared  with  previous  variations  and  if  the 
current  variation  is  unusual,  then  the  node  is  audited. 
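
A history-based audit check of this kind might be sketched as follows. Only the 24% threshold comes from the text; the class name, the history length, and the rule "mean plus three standard deviations of recent variations" used to decide what counts as unusual are hypothetical choices made for illustration.

```python
from collections import deque
from statistics import mean, pstdev

AUDIT_THRESHOLD = 0.24  # from the text: normal variation is about +/-24% of load


class NodeAuditor:
    """Keeps a short history of load variations for one node (illustrative)."""

    def __init__(self, history_len: int = 32, sigmas: float = 3.0):
        self.variations = deque(maxlen=history_len)  # recent relative variations
        self.last_load = None
        self.sigmas = sigmas  # hypothetical outlier rule, not from the source

    def observe(self, load: float) -> bool:
        """Record a load sample and return True if the node should be audited."""
        if not self.last_load:  # first sample, or previous load was zero
            self.last_load = load
            return False
        variation = abs(load - self.last_load) / self.last_load
        audit = variation > AUDIT_THRESHOLD
        if len(self.variations) >= 2:
            mu, sd = mean(self.variations), pstdev(self.variations)
            # Unusual relative to this node's own recent behavior.
            audit = audit or (sd > 0 and variation > mu + self.sigmas * sd)
        self.variations.append(variation)
        self.last_load = load
        return bool(audit)
```

One auditor instance would be kept per node, plus one for the network as a whole, each fed with periodic load samples.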

An active intruder can continuously vary system load either to actually exploit
the covert channel to communicate with an accomplice or to fabricate a history
of constantly varying traffic characteristics. Such behavior will foil the above
handling policy. Therefore, in addition to monitoring individual nodes for
variations in traffic characteristics, monitoring the variability of those
variations for each node can give additional insight into the behavior of the
nodes and detect potential covert channel usage.

6.9     Conclusion 

Static  scheduling  policy  guarantees  spatial  and  temporal  neutrality  at  the  ex- 
pense of  the  responsiveness  of  the  system.  On  the  other  hand,  adaptive  scheduling 
policy  improves  the  responsiveness  of  the  system  at  the  expense  of  certain  covert 
channels  due  to  temporal  variations  in  traffic  characteristics. 

To contain the covert channels that exist due to temporal variation in traffic
characteristics, we define an audit strategy. Based on our measurements of
traffic on ECSNET, we define the parameters of our audit and handling policy,
which include the audit threshold and the granularity of change in the number of
idle slots in a period. The effects of system load, cycle length, granularity of
change and audit threshold on covert channel capacity are studied and the
results presented. Our analysis indicates that auditing covert channels is an
effective method of handling a known covert channel by deterring its potential
users, and that the handling policies we suggest reduce the covert channel
capacity to TCSEC acceptable levels.

In  the  next  chapter,  we  present  a  performance  analysis  of  the  model  for  spatial 
neutrality  and  validate  analytical  results  with  a  simulation  study  based  on  the  traffic 
trace  from  our  measurements  on  UFNET.  The  performance  of  the  heuristic  algorithm 
is  also  compared  with  an  integer  programming  implementation. 


CHAPTER  7 
PERFORMANCE  ANALYSIS  OF  THE  MODEL 

7.1     Introduction 

Load  cost,  a  primary  concern  in  this  work,  is  formulated  as  minimization  of 
rerouted  messages  and  dummy  messages.  A  solution  using  rerouting  may  have  low 
load  cost  but  a  larger  cost  in  delay  than  an  approach  based  on  dummy  packets  alone. 
This  chapter  analyzes  the  model  to  prevent  traffic  analysis  by  rerouting  and  padding 
the  traffic  matrix,  so  that  the  apparent  final  traffic  matrix  is  neutral.  The  objective 
of  this  analysis  is  to  justify  the  claims  of  the  model  and  to  show  that  rerouting  of 
traffic  via  intermediate  nodes  with  minimal  padding  is  indeed  a  cost  effective  method 
to  prevent  traffic  analysis.  Simulation  results  supporting  the  above  claim  are  also 
presented  in  this  chapter. 

We will first describe an algorithm for deriving a spatially neutral traffic
matrix. The performance of this algorithm on a uniformly distributed traffic
matrix is compared with a linear program implementation of the reroute strategy.
We then use traffic characteristics from measurements done on the University of
Florida campus-wide backbone network (UFNET) to model an actual network.
Simulation results show that the algorithm performs better on a sparse traffic
matrix by reducing the reroute overhead required to achieve a neutral traffic
matrix. On the other hand, a sparse traffic matrix leads to increased padding
costs (in the absence of rerouting). Experiments done with UFNET traffic
characteristics show that the algorithm can be employed in actual networks to
achieve traffic neutrality with acceptable overheads.

7.2     Algorithm  for  Spatial  Neutrality 

The algorithm accepts as input the original traffic matrix and tries to generate
a neutral traffic matrix of minimum cost that still supports the necessary true
traffic. The subroutine we use accepts the original traffic matrix and a target
traffic matrix as inputs and finds a feasible rerouting strategy such that the
apparent traffic matrix dominates the target traffic matrix, i.e., each element
of the apparent traffic matrix is greater than or equal to the corresponding
element in the target traffic matrix. Our goal is to determine R, the vector of
reroute quantities. The apparent traffic matrix after rerouting can be padded, if
necessary, with dummy messages in order to make the final apparent traffic
matrix neutral. Although the target traffic matrices considered here are all
neutral, this is not a requirement of the subroutine or of the model.
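
The dominance and neutrality conditions used above can be stated compactly in code. This is a minimal sketch; the function names are hypothetical, matrices are assumed to be square lists of lists, and diagonal entries are ignored because there is no self communication.

```python
def dominates(apparent, target) -> bool:
    """True if every off-diagonal element of `apparent` is >= that of `target`."""
    n = len(apparent)
    return all(
        apparent[i][j] >= target[i][j]
        for i in range(n)
        for j in range(n)
        if i != j
    )


def is_neutral(matrix) -> bool:
    """True if all off-diagonal elements of `matrix` are equal."""
    n = len(matrix)
    values = {matrix[i][j] for i in range(n) for j in range(n) if i != j}
    return len(values) <= 1
```

For example, an apparent matrix whose off-diagonal entries are all at least 3 dominates the neutral target with every off-diagonal entry equal to 3, even if the apparent matrix itself is not yet neutral.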

The algorithm to generate the neutral traffic matrix, given the input traffic
matrix, is given in figure 7.1. It attempts to find a minimum cost reroute
matrix and then pads each element to the highest element. To compute the reroute
quantities for any node i, we first compute the mean volume of traffic,
$mean_i$, from this node to every other node in the network. If the volume of
traffic from node i to any other node j is larger than $mean_i$, then we try to
reroute the additional traffic via some intermediate node k. If node k can
absorb all the packets that node i wishes to reroute via it, with the volume
from i to k still being less than or equal to $mean_i$, this is done. Otherwise
node i reroutes via node k a volume of traffic that will make the node i to node
k traffic equal to $mean_i$, and then tries to reroute the remaining packets via
some other intermediate node. This process is repeated for each node, after
which all the non-diagonal elements of row i of the matrix are approximately
equal to $mean_i$. Now we find the maximum element in the traffic matrix and pad
all the other elements with values less than the maximum to be equal to the
maximum, yielding a neutral final traffic matrix. The time and space complexity
for this algorithm is $O(n^2)$. For a traffic matrix with 80 nodes where the
maximum traffic between any pair of nodes is 100 packets, an implementation of
the algorithm shown in figure 7.1 takes approximately 1.2 seconds on a Sun 4 to
compute the neutral traffic matrix in both cases: padding only and padding with
rerouting.
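
The procedure described above can be sketched in Python as follows. This is an illustrative reading of figure 7.1, not the implementation that was timed on the Sun 4. Two details are assumptions made here: the row mean is rounded up so that all reroute quantities stay integral, and intermediate nodes are tried in index order.

```python
import math


def neutralize(traffic):
    """Return (final, reroute, padding) for an n x n integer traffic matrix.

    reroute[i][j] counts the i -> j packets sent via an intermediate node.
    The input matrix is not modified. Requires n >= 2.
    """
    n = len(traffic)
    m = [row[:] for row in traffic]  # working copy: the apparent matrix
    reroute = [[0] * n for _ in range(n)]
    for i in range(n):
        # Assumption: round the row mean up to keep quantities integral.
        mean = math.ceil(sum(m[i][j] for j in range(n) if j != i) / (n - 1))
        for j in range(n):
            if i == j or m[i][j] <= mean:
                continue
            excess = m[i][j] - mean
            for k in range(n):
                if excess == 0:
                    break
                if k in (i, j) or m[i][k] >= mean:
                    continue  # k cannot absorb more i -> k traffic
                qty = min(excess, mean - m[i][k])
                m[i][j] -= qty
                m[i][k] += qty
                m[k][j] += qty  # non-local effect on row k
                reroute[i][j] += qty
                excess -= qty
    # Pad every off-diagonal element up to the largest one.
    peak = max(m[i][j] for i in range(n) for j in range(n) if i != j)
    padding = [[0 if i == j else peak - m[i][j] for j in range(n)]
               for i in range(n)]
    final = [[m[i][j] + padding[i][j] for j in range(n)] for i in range(n)]
    return final, reroute, padding
```

Each rerouted packet is counted twice in the apparent matrix (once on the i to k link and once on the k to j link), so the apparent volume before padding equals the original volume plus the total reroute quantity.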

7.2.1     Expected  Cost  Analysis 

Let us assume that the volume of internode traffic is random (with uniform
distribution) and that there is no self communication, so that there are at most
$n(n-1)$ non-zero elements in the traffic matrix. We also assume that
transmission of each packet contributes one unit to the total transmission cost.
Therefore the cost of the original traffic matrix of dimension n x n is

$\mathrm{Cost} = \sum_{i=1}^{n(n-1)} X_i$,

where $X_i$ is the cost of a single element in the traffic matrix.

The expected cost of the initial traffic matrix is

$E(\mathrm{cost}) = E\left(\sum_{i=1}^{n(n-1)} X_i\right)$
Input:  An n x n traffic matrix M
        ; Element M[i,j] represents i -> j traffic
Output: An n x n neutral traffic matrix T_f
        An n x n Reroute matrix
        An n x n Padding matrix

for i = 1 to n do
    ; Row i represents the traffic from a given node i to all other nodes
    ; in the network. Compute the mean traffic for row i.
    mean = ( sum over j != i of M[i,j] ) / (n - 1)
    for j = 1 to n do
        if ( (i != j) and (M[i,j] > mean) ) then
            ; If traffic volume from node i to j is more than the mean volume
            ; of traffic from i to every other node then we may reroute the
            ; excess traffic via an intermediate node k, k in [1 .. n].
            excess-traffic = M[i,j] - mean
            k = 1
            while ( (excess-traffic > 0) and (k <= n) ) do
                if ( (k = i) or (k = j) or (M[i,k] >= mean) ) then
                    ; node k cannot absorb any of the excess traffic
                    k = k + 1
                else if ( M[i,k] + excess-traffic <= mean ) then
                    ; reroute all excess traffic via intermediate node k
                    M[i,j] = M[i,j] - excess-traffic
                    M[i,k] = M[i,k] + excess-traffic
                    ; Add reroute packets to M[k,j] due to non-local effect.
                    M[k,j] = M[k,j] + excess-traffic
                    excess-traffic = 0
                else
                    ; reroute traffic via intermediate node k such that
                    ; M[i,k] = mean
                    reroute-qty = mean - M[i,k]
                    M[i,j] = M[i,j] - reroute-qty
                    M[i,k] = M[i,k] + reroute-qty
                    ; Add reroute packets to M[k,j] due to non-local effect.
                    M[k,j] = M[k,j] + reroute-qty
                    excess-traffic = excess-traffic - reroute-qty
                    ; Try to reroute remaining excess-traffic via another
                    ; intermediate node.
                    k = k + 1
                endif
            endwhile
        endif
    endfor
    ; Finish rerouting packets within a row
endfor
; Finish rerouting packets, output Reroute Matrix

; Compute Padding Matrix to get Neutral M (padding is done after rerouting)
max-element = max over i != j of M[i,j]
for each element M[i,j], i != j do
    Padding[i,j] = max-element - M[i,j]
endfor

; Compute Cost and Overhead due to Rerouting + Padding
T_f[i,j] = M[i,j] + Padding[i,j]
Output Neutral Traffic Matrix T_f

Figure 7.1. Algorithm to Reroute Packets
$= \sum_{i=1}^{n(n-1)} E(X_i)$

$= \sum_{i=1}^{n(n-1)} (M + m)/2 = n(n-1)M/2$

where M is the maximum value of the uniform distribution and m is the minimum
value (m = 0).

To achieve a neutral matrix by padding only, we pad all elements in the matrix
to be equal to the maximum element in the matrix. We derive the expected cost
due to padding, $E(\mathrm{cost}_p)$.

Let $k = n(n-1)$ and $X = \max_i(X_i)$.
Therefore the cost due to padding is

$\mathrm{cost}_p = \sum_{i=1}^{k} (X - X_i)$

and the expected cost due to padding, $E(\mathrm{cost}_p)$, is

$E(\mathrm{cost}_p) = kE(X) - \sum_{i=1}^{k} E(X_i) = kE(X) - kM/2$

Let $P_a = \mathrm{Prob}(X \le a) = (a/M)^k$. Then

$E(X) = M - \sum_{a=0}^{M-1} P_a = M - \sum_{a=0}^{M-1} (a/M)^k$

Therefore,

$E(\mathrm{cost}_p) = k\left(\frac{M}{2} - \frac{1}{M^k}\sum_{a=0}^{M-1} a^k\right)$
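
To get a feel for the magnitudes involved, the two expectations can be evaluated numerically. The sketch below assumes the closed forms $E(\mathrm{cost}) = n(n-1)M/2$ and $E(\mathrm{cost}_p) = k\left(M/2 - M^{-k}\sum_{a=0}^{M-1} a^k\right)$ with $k = n(n-1)$; the function names are hypothetical.

```python
def expected_original_cost(n: int, M: int) -> float:
    """E(cost) = n(n-1)M/2 for element volumes uniform with maximum M."""
    return n * (n - 1) * M / 2


def expected_padding_cost(n: int, M: int) -> float:
    """E(cost_p) = k * (M/2 - sum_{a=0}^{M-1} (a/M)**k), with k = n(n-1)."""
    k = n * (n - 1)
    tail = sum((a / M) ** k for a in range(M))
    return k * (M / 2 - tail)


if __name__ == "__main__":
    for n in (4, 10, 25):
        print(n, expected_original_cost(n, 15), expected_padding_cost(n, 15))
```

As k grows, the sum over a becomes negligible, so the expected padding cost approaches kM/2, the expected cost of the original matrix.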
7.3     Performance Analysis using Simulation: Uniform Traffic

7.3.1     Simulation Model: Uniform Traffic Distribution

The algorithm in figure 7.1 was simulated on traffic matrices of various
dimensions and various traffic load ranges between source-destination pairs. A
random number generator with uniform distribution was used to generate the
traffic matrices. A simulation run consisted of generating traffic matrices with
dimensions varying from 4 x 4 to 75 x 75, with each element (representing the
volume of traffic) in the range 0-100, 0-50, 0-25 or 0-15 for a particular
simulation run. The transmission cost of this original matrix was computed and
then the algorithm was employed to get a neutral traffic matrix. The
transmission cost and the overhead incurred due to rerouting and padding were
compared and plotted with respect to the original cost. The same analysis was
done for the case when a neutral traffic matrix was obtained by padding only. In
each case, we have also computed and plotted the minimum cost predicted by the
model to achieve a neutral final traffic matrix by rerouting only and the
expected cost due to padding only. See figure 7.2, figure 7.3, figure 7.4 and
figure 7.5.

To increase our confidence in the results obtained, we repeated the simulation
thrice, each time with a different seed for the random number generator. From
the results obtained, the coefficient of variation of the rerouting cost was
small, and the standard deviation of the rerouting cost for higher dimension
matrices was approximately zero. This observation was not unexpected, as the
possibility of rerouting packets via intermediate nodes increases considerably
in higher dimensional matrices and the reroute traffic tends to "even out"
without excessive padding, leading to minimal rerouting costs. It was also
observed that the algorithm performed equally well in the case where the
internode traffic was 0-100 packets as in the case where internode traffic was
0-15 packets. Using the analysis of variance (ANOVA) technique, we found that at
the 95% level of significance there is no difference in the mean costs of the
reroute matrix in seven trials. For testing the hypothesis of no difference in
the mean cost of the reroute matrix in the seven trials at 95% significance
(using the one way ANOVA), we obtained a test statistic of 1.98 (the table value
of F(6, ∞) at 0.95 is 2.10), i.e., the observed test statistic is less than the
table value. Hence we accept the hypothesis of no difference in the mean costs.
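
For reference, the one way ANOVA test statistic can be computed as in the following minimal sketch. It is a generic textbook computation and not the procedure or data used in the study; the function name is hypothetical, and each group is assumed to hold the reroute costs from one trial.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of samples.

    Each element of `groups` is a list of observations from one trial.
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [sum(g) / len(g) for g in groups]
    # Variation of the group means around the grand mean.
    ss_between = sum(
        len(g) * (gm - grand_mean) ** 2 for g, gm in zip(groups, group_means)
    )
    # Variation of the observations around their own group mean.
    ss_within = sum(
        (x - gm) ** 2 for g, gm in zip(groups, group_means) for x in g
    )
    df_between, df_within = k - 1, n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within
```

With seven trials the between-group degrees of freedom are 6, and the hypothesis of equal means is retained when the returned statistic is below the tabulated critical value (2.10 for F(6, ∞) in the text).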

The expected padding cost matches simulation results closely for lower dimension
networks, but appears to underestimate them for larger networks. The lower bound
on the expected rerouting cost appears to account for about half of the
additional cost. Rerouting plus padding gives us a neutral traffic matrix at
about twice the cost of the original traffic matrix, and at about half to one
third the cost of padding alone.

7.4     Integer  Programming  Formulation 

The transshipment problem is any problem

minimize $cx$ subject to $Ax = b$, $x \ge 0$

such that A is the n x m incidence matrix of some network and such that

$\sum_{i=1}^{n} b_i = 0$
Figure 7.2. Transmission Cost: Uniform Traffic, Volume 0-15 (x-axis: Number of Nodes, n; legend includes Original Cost)
Figure 7.3. Transmission Cost: Uniform Traffic, Volume 0-25
Figure 7.4. Transmission Cost: Uniform Traffic, Volume 0-50
Figure 7.5. Transmission Cost: Uniform Traffic, Volume 0-100

In our case, we are interested in integer solutions to the problem. Therefore we
will formulate the problem of rerouting traffic to achieve a neutral traffic
matrix as an integer linear programming problem.

Problems of the form

minimize $cx$ subject to $Ax = b$, x integer valued

are known as integer linear programming problems. Such problems are very hard to
solve, both in theory and practice. Yet if A happens to be the incidence matrix
of a network, then the problem can be solved quite efficiently by the network
simplex method.

We will now give an integer linear programming formulation of our original
problem defined in chapter 3. For a given traffic matrix M, and a known target
traffic matrix T, the rerouting information can be found, subject to the
feasibility constraints discussed below, by solving

$T_f = DM \times R + M_f + P_f$

We wish to minimize P, the amount of padding, given a neutral target traffic
matrix (i.e., given an integer $\alpha$ such that $T = \alpha N_0$).

$T = \sum_{i,k,j} R_{i,k,j}\, r_{i,k,j} + M + P$
$(T - P) = \sum_{i,k,j} R_{i,k,j}\, r_{i,k,j} + M$
$N = \sum R r + M$,

where N is the real packet traffic between nodes i, j

$N = Ax + M$

where A is the basic structure of the network and can represent any topology.
For example, as shown in figure 7.6, $A_{ij} = -1$, $A_{ik} = 1$, $A_{kj} = 1$
for $r_{i,k,j}$. N is the true traffic volume between any pair of nodes i, j.

Figure 7.6. LP Formulation to Compute Neutral TM
Figure 7.7. Structure of LP Formulation

Therefore,

minimize [maximum $\forall i,j$ $N_{ij}$]

given $T_{ij} = \alpha N_0$ and $T = N + P$
Therefore,
$T = Ar + M + P$
$T - M = Ar + P = b$, or
$\forall i,j:\; T_{ij} - M_{ij} \ge 0$
$Ax = b$, where A, x and b are as shown in Figure 7.7.

minimize $cx$, $x \ge 0$ and x integer.
Since $\sum b \ne 0$, we do not have a system of linear equations but we have an
integer linear program formulation of the problem.
7.4.1     GAMS Implementation

GAMS  (General  Algebraic  Modeling  System)  is  a  software  package  designed  to 
make  the  construction  and  solution  of  large  and  complex  mathematical  programming 
models  easier[6].  GAMS  currently  accommodates  linear,  nonlinear  and  mixed  integer 
optimization  models,  as  well  as  the  special  cases  of  simultaneous  linear  or  nonlinear 
systems.  For  our  implementation,  we  used  GAMS  version  2.05  on  VAX/VMS  version 
V5.5-2. 

Our original problem of computing the rerouting quantities is implemented as a
transshipment problem, the objective function being to minimize the maximum
element in the final neutral traffic matrix. The MIP (Mixed Integer Programming)
problem solver was used to solve our transshipment model. The model was used to
determine the reroute cost for traffic matrices of sizes ranging from 4 x 4 to
9 x 9. A random number generator with uniform distribution was used to generate
the traffic matrices, with each element in the range 0-15.

The transmission volume or cost of this original matrix was computed and then
the algorithm was employed to get a neutral traffic matrix. The transmission
cost and the overhead incurred due to rerouting were compared and plotted with
respect to the original cost. The same analysis was done for the case when a
neutral traffic matrix was obtained using the heuristic to compute the expected
reroute cost. In each case, we have also computed and plotted the expected
minimum cost predicted by the model to achieve a neutral final traffic matrix by
rerouting only and the expected cost due to padding only.
Figure 7.8. Reroute Cost by Linear Programming Versus Heuristic

From figure 7.8, we can see that the cost of the neutral traffic matrix computed
by the algorithm is within 10% of the cost computed by the integer programming
formulation for lower dimension matrices and is approximately within 30% of the
cost for larger dimension matrices.

To find a globally optimal solution, we suggest the following approach. We
select a feasible initial solution in which the elements of the neutral traffic
matrix are set to the largest element of the original traffic matrix. This
guarantees a feasible solution with integer quantities. Now we use a binary
search method to narrow the search for a solution that is feasible and optimal.
Due to the integrality theorem, all such solutions will have integer values.
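
The binary search step can be sketched as follows. The feasibility test itself (whether a neutral matrix with every element equal to a given level can be reached) is left as a caller-supplied function; it is a hypothetical stand-in for solving the transshipment model at a fixed level. The sketch also assumes that feasibility is monotone, i.e., if a level is feasible then every larger level is feasible as well.

```python
from typing import Callable


def smallest_feasible_level(
    lo: int, hi: int, feasible: Callable[[int], bool]
) -> int:
    """Smallest integer level in [lo, hi] for which feasible(level) is True.

    Assumes feasible is monotone (False ... False True ... True) on the
    range and that feasible(hi) holds, e.g. hi = largest original element.
    """
    if lo > hi or not feasible(hi):
        raise ValueError("feasible(hi) must hold and lo must not exceed hi")
    while lo < hi:
        mid = (lo + hi) // 2
        if feasible(mid):
            hi = mid  # mid works, so the optimum is at mid or below
        else:
            lo = mid + 1  # mid fails, so the optimum is above mid
    return lo
```

Starting from the largest element of the original traffic matrix as the upper end, the search needs only a logarithmic number of feasibility tests.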
7.5     Performance Analysis using Simulation: UFNET Traffic

In this section, we extend the model presented earlier by employing more
realistic simulation parameters. Measurements done on the University of Florida
campus-wide backbone network (UFNET) provide us with considerable experience to
modify our earlier simulation to better model an actual network. In this
modification, we wish to discard the assumption that traffic volume and the
actual communication between various nodes in the network are uniformly
distributed and instead use a more realistic traffic distribution. However, we
acknowledge that some simplifying assumptions are still retained to keep the
network model and its simulation tractable. We will present our assumptions and
the justification for making these assumptions in subsequent sections.

7.5.1     Observations on ECSNET Traffic Characteristics

In an independent study, we undertook to characterize the traffic on the
University of Florida campus-wide backbone network (UFNET). Measurements carried
out on UFNET over a period of more than one year, at different points in the
network, gave us considerable insight into the nature of network traffic and
brought into focus various subtle interactions of the network architecture and
user behavior. One such observation relevant to our simulation model is that
neither the intra-LAN nor the inter-LAN network traffic is uniformly
distributed. In fact, the traffic distribution depends to a large extent on the
connectivity, the logical and administrative domains, the applications running
on the nodes and the user behavior. Based on these observations, we reviewed the
simulation model and found that there is scope for improvement.

Measurements were performed on the ECSNET (Engineering Consulting Services Net,
a subnetwork of UFNET). The percentages of intra-LAN, inter-LAN and LAN-WAN
traffic observed for these networks were 37.04%, 57.07% and 5.89% respectively.
The total volume of communication over a typical 24 hour period was 31,202,376
packets, with an average packet size of 275 bytes. The traffic matrix for the
nodes in ECSNET is shown in table 7.1. This traffic matrix is used as a basis to
generate the traffic matrices for simulation model 2 (discussed in section
7.5.3). The 12 x 12 matrix represents the traffic between 12 nodes in the
ECSNET. The maximum element in the matrix is 369 Mbytes (between Wasp and
Bigguy); the minimum (among communicating node pairs) is 60 bytes. If node i
does not communicate with node j, then TM[i,j] = 0. The mean volume is 32 Mbytes
and the standard deviation is 106 Mbytes. The maximum cumulative volume received
was 383 Mbytes by Wasp and the maximum cumulative volume transmitted was 377
Mbytes by Bigguy. See figure 7.9 and figure 7.10 respectively.

7.5.2     Modeling  Issues 

In  this  section,  we  discuss  some  of  the  assumptions  and  design  choices  made  in 
modeling  the  network  and  the  simulation. 

1.  Traffic Distribution:

Traffic distribution is closely related to the topology, connectivity and
applications. It is also related to factors such as the purpose or objective of
the network, administrative policies, user interaction, etc. However, viewing
the traffic as a packet stream, the packet train model indicates a strong
(bidirectional) correlation between the flow of traffic between source and
destination [24]. Therefore, in this simulation, we use a non-uniform (but
correlated) distribution, extracted from measurements done on UFNET.

To this end, we set, $\forall i,j$, $TM[i,j] \approx TM[j,i]$, i.e., the volume
of communication is symmetric. However, in some cases, communication is
predominantly unidirectional, i.e., $TM[i,j] \ll TM[j,i]$. For example, in a
client-server session with an NNTP server, communication (by volume) is
primarily unidirectional from the server to the client nodes. The clients send
requests (usually small packets) for particular news items; the server responds
by sending the articles (usually much larger packets in comparison to the
request).

Figure 7.9. Total Bytes Received by Nodes in ECSNET
Figure 7.10. Total Bytes Transmitted by Nodes in ECSNET
Table 7.1. Traffic Matrix of Nodes in ECSNET, UFNET

2.  Sparsity of Traffic Matrix:

Sparsity depends on the actual communication between each pair of nodes. Note
that we distinguish "connectivity" (existence of a link) from "communication"
(actual exchange of packets). Each node communicates largely with a small set of
favored nodes and occasionally with a few other nodes. This suggests that the
traffic matrix (of a large network) would be a very sparse matrix. If node i is
in node j's set of favored nodes, $C_j$, then it is highly likely that node j
will be in node i's favored set, $C_i$, too.

Or simply, $\forall i,j$: $i \in C_j \leftrightarrow j \in C_i$.

In some cases, however, $TM[i,j] = 0$ but $TM[j,i] \ne 0$, i.e., there is no
communication from i to j. For example, even if there is no user or application
initiated communication between two nodes, there might be some (control)
application like NTP, Route, ICMP, etc., that might exchange packets
periodically. Therefore the corresponding traffic matrix entry would be
non-zero. In our simulation, for simplicity, such entries are taken to be
zeroes.
3.  Topology:

Given that we are dealing with special networks that are designed to operate
under various security threats (traffic analysis being one of them), it is quite
reasonable to retain the assumption of a fully connected network. Completely
connected networks improve reliability, fault tolerance, the ability to
accommodate sudden bursts of traffic by rerouting, etc. Given complete
connectivity of the underlying network, any logical configuration (connectivity)
can be set up, as the existence of a link between nodes does not necessarily
imply communication between them.
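
The modeling choices above (symmetric volumes, sparse matrices built from small favored sets, and zero entries for pairs that exchange only control traffic) can be illustrated with a small generator. This is only a sketch: the function and parameter names are hypothetical, and volumes are drawn uniformly as a placeholder, whereas the simulation in this chapter used a distribution extracted from the UFNET measurements.

```python
import random


def synthetic_traffic_matrix(n, favored, max_volume, seed=None):
    """Generate a sparse, symmetric n x n traffic matrix with zero diagonal.

    Each node picks `favored` partners at random; a chosen pair (i, j) gets
    the same volume in both directions, so i in C_j implies j in C_i.
    """
    rng = random.Random(seed)
    tm = [[0] * n for _ in range(n)]
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for j in rng.sample(others, min(favored, len(others))):
            # Placeholder volume model: uniform on 1..max_volume.
            volume = rng.randint(1, max_volume)
            tm[i][j] = tm[j][i] = volume
    return tm


if __name__ == "__main__":
    for row in synthetic_traffic_matrix(6, favored=2, max_volume=15, seed=1):
        print(row)
```

Node pairs that are never selected keep a zero entry, which corresponds to treating occasional control traffic as no communication.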

7.5.3     Simulation  Model:  ECSNET  Traffic  Distribution 

Two different simulation experiments were done; we first explain the model and
then discuss the results. In both experiments, the transmission cost of the
original traffic matrix is computed and the algorithm is then employed to get a
neutral traffic matrix. The transmission cost and the overhead incurred due to
rerouting are compared and plotted with respect to the original cost. The same
analysis is done for the case when a neutral traffic matrix is obtained by
padding only. In each case, we also computed and plotted the minimum cost
predicted by the model to achieve a neutral final traffic matrix by rerouting
and padding and the expected cost due to padding only.

In the first experiment, we used the traffic distribution from UFNET
(non-uniformly distributed) to generate the traffic volume. The simulation
environment and setup and the reroute algorithm used are the same as in section
7.3. Each experiment consists of generating traffic matrices with the desired
traffic distribution. We choose four traffic volume ranges: 0-100, 0-50, 0-25
and 0-15. Traffic matrix dimensions varied from 4 x 4, representing small
networks, to 75 x 75, representing larger networks.

Figure 7.11. Transmission Cost: UFNET Traffic, Volume 0-15

Figure  7.11,  figure  7.12,  figure  7.13,  and  figure  7.14  show  the  performance  mea- 
sures for  simulation  runs  with  different  traffic  volume  ranges.  Each  experiment  was 
repeated  thrice;  the  figures  show  the  mean  values  of  the  results  obtained  from  these 
experiments. 

Figure 7.12. Transmission Cost: UFNET Traffic, Volume 0-25
Figure 7.13. Transmission Cost: UFNET Traffic, Volume 0-50
Figure 7.14. Transmission Cost: UFNET Traffic, Volume 0-100

From the figures, we see an improvement in the performance of the rerouting
algorithm compared to what was presented earlier. This apparent improvement can
be explained by the following observations. First, the traffic matrices are
sparse, and it is therefore much easier and more cost effective to reroute
packets via those nodes with which there is no (or minimal) communication in the
original traffic matrix. This is more cost effective because the non-local
effects of rerouting are less severe and fewer dummy packets are required to
achieve a neutral traffic matrix. Sparse traffic matrices are therefore not only
more realistic but also help the rerouting strategy. We hasten to add, however,
that the sparsity of the matrix is also what has increased the padding cost
(without rerouting). When we compute the padding cost in our model, we simply
pad all elements of the matrix to equal the largest element of the original
traffic matrix. Because of the sparsity, there are more zero elements and
therefore more dummy packets are needed to achieve a neutral traffic matrix. As
before, we again observe that the algorithm performs better with larger
matrices. This is because the possibilities for rerouting packets via
intermediate nodes increase considerably in larger matrices and the rerouted
traffic tends to "even out" without excessive padding, leading to minimal
rerouting costs.
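
The padding-only cost described above can be illustrated with a minimal sketch.
It assumes that cost is counted in dummy packets and that diagonal entries (a
node sending to itself) are ignored; the function name padding_cost and the two
example matrices are hypothetical and are not taken from the simulation.

```python
def padding_cost(traffic):
    """Count the dummy packets needed to raise every entry to the largest one.

    Assumption: diagonal entries (a node sending to itself) are ignored.
    """
    n = len(traffic)
    off_diagonal = [traffic[i][j] for i in range(n) for j in range(n) if i != j]
    peak = max(off_diagonal)
    # Each entry is padded up to the peak value of the original matrix.
    return sum(peak - volume for volume in off_diagonal)


# Hypothetical 3 x 3 matrices with the same peak volume of 10.
dense = [[0, 9, 8],
         [10, 0, 9],
         [8, 10, 0]]
sparse = [[0, 0, 0],
          [10, 0, 0],
          [0, 0, 0]]

print(padding_cost(dense))   # 6
print(padding_cost(sparse))  # 50
```

With the same peak volume, the sparse matrix needs far more dummy packets than
the dense one, which is the effect the paragraph attributes to sparsity.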

7.5.4     Simulation  Model:  ECSNET  Traffic  Matrix

In this experiment, we used the measurements done on UFNET as the basis for
generating traffic matrices. The purpose was to study the performance of our
rerouting algorithm on actual traffic traces.

Each experiment consists of randomly selecting elements from the given traffic
matrix to construct a traffic matrix for our experiments. Traffic matrix
dimensions varied from 4 x 4 to 12 x 12; the experiment was repeated 24 times to
ensure that we have statistically stable traffic matrices.

Figure 7.15 shows the performance of the rerouting policy. We note that the
performance of the rerouting policy depends to a large extent on the actual
distribution of traffic volumes. We also see that the performance of the
rerouting algorithm varies with the traffic matrix dimension. This can be
explained by the wide variation between the maximum and minimum traffic volumes:
with a larger traffic matrix, the algorithm can better reroute traffic via
intermediate nodes, thereby requiring only minimal padding with dummy packets to
achieve a neutral traffic matrix. This experiment shows that this rerouting
policy can be used in actual networks, under moderate to heavy load conditions,
with acceptable overheads to achieve a neutral traffic matrix.

Figure 7.15. Transmission Cost: UFNET Traffic Trace

7.6     Performance  of  Random  Routing  Strategy

In this section, we present the simulation results of a random routing strategy
as a basis of comparison with the performance of the heuristic algorithm to
achieve a neutral traffic matrix. We use two-phase routing: in the first phase,
each packet is routed to a randomly chosen node; in the second phase, it is
rerouted to the destination from the intermediate node. If the node reached at
the end of the first phase is also the actual destination of the packet, then
the second phase has no effect. Thus a rerouted packet experiences exactly one
extra hop to reach its destination.

Consider any node, i, in the network, whose traffic is represented by the
element T[i, j] in the traffic matrix. This node can transmit packets to every
node in the same row and can receive packets from every node in the same column.
Assuming that the traffic is evenly distributed and that the first phase of the
routing process is random, we expect the traffic from node i to be evenly
distributed to the other n - 1 nodes in the network. Similarly, we can expect
that node i receives packets from each of the other n - 1 nodes in the same
column of the traffic matrix. However, due to nonlocal effects of rerouting, it
is not possible to guarantee a spatially neutral final traffic matrix.

The value of random routing is in its ability to generate an acceptable final
traffic matrix without the need to compute the reroute quantities as in the case
of the heuristic algorithm. The random routing policy also lends itself well to
distributed implementation.
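
The two-phase policy can be sketched as follows. This is an illustration under
stated assumptions, not the simulator used for the figures: the intermediate
node is assumed to be drawn uniformly from the other n - 1 nodes, every hop is
assumed to cost one unit per packet, and the names random_route and total are
hypothetical.

```python
import random


def random_route(traffic, rng):
    """Apply two-phase random routing and return the observed link-level matrix."""
    n = len(traffic)
    observed = [[0] * n for _ in range(n)]
    for src in range(n):
        for dst in range(n):
            if src == dst:
                continue
            for _ in range(traffic[src][dst]):
                # Phase 1: send to a node chosen uniformly among the other n - 1.
                mid = rng.choice([k for k in range(n) if k != src])
                observed[src][mid] += 1
                # Phase 2: forward to the destination unless already there.
                if mid != dst:
                    observed[mid][dst] += 1
    return observed


def total(matrix):
    """Total number of packet transmissions recorded in a matrix."""
    return sum(map(sum, matrix))


# Hypothetical 4-node traffic matrix.
original = [[0, 5, 0, 0],
            [0, 0, 0, 0],
            [3, 0, 0, 2],
            [0, 0, 0, 0]]
observed = random_route(original, random.Random(0))
print(total(original), total(observed))
```

Under these assumptions each packet takes one hop with probability 1/(n - 1)
and two hops otherwise, so the observed load lies between one and two times the
original load and approaches twice the original as n grows. No step computes
reroute quantities centrally, which is why the policy distributes easily.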

Figures 7.16 through 7.19 show the performance of random routing with respect
to the heuristic rerouting algorithm discussed in figure 7.1 for UFNET traffic
distribution. From the figures, it can be seen that the cost of transmission for
the random rerouting strategy is almost twice the cost of the original traffic
matrix and almost half the cost of deriving a neutral traffic matrix. Both the
random rerouting strategy and the heuristic algorithm for a spatially neutral
traffic matrix still perform better than simple padding. From this we conclude
that random rerouting is preferable to padding and should be adopted as the
minimum protection against traffic analysis. In cases where the perceived threat
of traffic analysis is ubiquitous, the additional transmission cost and
computational overhead to derive a neutral traffic matrix are justified.

7.7     Conclusion 

A heuristic algorithm to obtain a spatially neutral traffic matrix was proposed,
and the expected load costs for padding only and for padding after rerouting
were computed.

Our experiments, using uniform traffic distribution, have shown that the load
cost of the spatially neutral traffic matrix, relative to the original traffic
matrix, is four times with padding only and two times with padding after
rerouting. Similarly, with UFNET traffic distribution, the costs were four and
eight times respectively. This increase in cost is due to the sparsity of the
UFNET traffic matrix and the wide variation in the traffic volumes among nodes
in the subnet.

Figure 7.16. Transmission Cost: Random Routing, UFNET Traffic, Volume 0-15
Figure 7.17. Transmission Cost: Random Routing, UFNET Traffic, Volume 0-25
Figure 7.18. Transmission Cost: Random Routing, UFNET Traffic, Volume 0-50
[Figure 7.18 plot: x-axis Number of Nodes; curves: Original Cost, Padding Cost,
Exp. Padding Cost, Reroute Cost, Exp. Min. Cost (Reroute+Pad), Random Reroute
Cost]
Figure 7.19. Transmission Cost: Random Routing, UFNET Traffic, Volume 0-100

An  integer  programming  formulation  of  the  heuristic  algorithm  is  given  and 
implemented  in  GAMS.  We  see  that  the  performance  of  the  heuristic  algorithm,  in 
terms  of  the  load  cost  of  the  neutral  traffic  matrix,  for  smaller  traffic  matrices  is 
within  10  percent  of  the  optimal  cost  and  for  larger  traffic  matrices  is  within  30 
percent  of  the  optimal  cost. 

To evaluate the efficiency of the rerouting strategy, we simulated a random
routing strategy as a basis of comparison. With the random routing strategy, the
final traffic matrix was not required to be spatially neutral. Using UFNET
traffic characteristics as input to the simulation model, we found that the load
cost of the final (non-neutral) traffic matrix was only twice that of the
original traffic matrix, as compared to four times the cost of the original
traffic matrix using the heuristic algorithm. Depending on the threat perceived
by the network and other performance considerations, one may select either the
random routing strategy or the heuristic routing strategy.

The time and space complexity of this algorithm is O(n^2). For an original
traffic matrix representing a large network (80 nodes), an implementation of the
algorithm takes approximately 1.2 seconds on a Sun 4 to compute the neutral
traffic matrix in both cases: padding only and padding with rerouting. On the
other hand, the integer programming implementation took approximately 80 minutes
for a 9 node network.


We expect the duration of transmission to be on the order of a few tens of
minutes. Having seen that the heuristic algorithm can compute the reroute
quantities in about one second and can respond to variations in load every 10 to
20 seconds (from capacity analysis), we conclude that the algorithm can deliver
spatial and temporal neutrality within acceptable cost and performance penalties
in an actual network.


CHAPTER  8 
CONCLUSIONS  AND  FUTURE  WORK 


8.1     Summary 

The study of covert channels in computer systems is gaining importance and
becoming practically viable with the availability of tools and mechanisms to
identify and contain most of the storage and simple timing channels that exploit
traditional computer system resources. However, there has not been much effort
in identifying and containing covert channels arising in communication
subsystems. We address this problem and identify the covert channels that exist
due to spatial and temporal variation in traffic characteristics.

Prevention of traffic analysis on communication networks can be effectively
achieved using the model presented in this thesis. Characterization of temporal
variation in traffic characteristics is a significant step in our attempts to
prevent traffic analysis and associated covert channels. Our model for the high
level prevention of traffic analysis eliminates spatial covert channels. A
static transmission schedule smooths out temporal variations in traffic,
effectively containing the remaining covert channels at the expense of system
responsiveness. To maintain an acceptable level of system performance and
quality of service, an adaptive scheduling policy is defined.

We propose formal and informal techniques to estimate covert channel capacity.
General formulas are given to compute the maximum covert channel capacity under
our model. Maximum channel capacity has been computed for various network
scenarios and useful conclusions derived. Covert channels that persist due to
temporal variation are audited and various handling mechanisms proposed to
reduce the capacity of covert channels to TCSEC acceptable levels.

The performance of the algorithm to derive a neutral traffic matrix is compared
with a linear integer program formulation of the problem. A random routing
strategy was also tested to derive target traffic matrices that were not
spatially neutral but were effective in prevention of traffic analysis.
Performance analysis done on the model using traffic characteristics from
measurements on UFNET indicates that the model can be used effectively in actual
networks under moderate to heavy load conditions to prevent traffic analysis.

It is our belief that this model presents a framework to address the traffic
analysis problem and associated covert channels. However, there are several
related issues that need further attention.

The interactions between the routing algorithm used by the network, congestion,
queuing delay and traffic analysis countermeasures are interesting. The
rerouting algorithm introduces some additional packets to the total network
traffic due to the nonlocal effect of rerouting a packet from a given
source-destination pair via an intermediate node. This and the dummy packets may
lead to congestion or queuing delay. However, as the traffic load at each node
in the final traffic matrix is equal, by designing the system for worst-case
conditions we may be able to avoid congestion. As we have assumed a completely
connected network, we have a point-to-point link between each pair of nodes in
the network. Routing of packets from source to destination is trivial in this
case. In a partially connected network, there is certainly some interaction
between rerouting and the routing algorithm used. For example, it may be
desirable to reroute traffic between a given source-destination pair via a
restricted set of intermediate nodes. Such a constraint is useful if the
objective is to maximize the usage of a low-cost link (optimizing cost) or if
not all the nodes in the network agree to share the costs of traffic analysis
prevention.
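
The nonlocal effect and the restricted-intermediate constraint mentioned above
can be made concrete with a small sketch. It is illustrative only: the function
reroute, its parameters, and the allowed set are hypothetical names, and the
sketch does not reproduce the heuristic algorithm of chapter 7.

```python
def reroute(traffic, src, dst, amount, mid, allowed):
    """Move `amount` packets of src->dst traffic onto the path src->mid->dst.

    `allowed` is the (hypothetical) set of intermediate nodes permitted
    for this source-destination pair.
    """
    if mid not in allowed or mid in (src, dst):
        raise ValueError("intermediate node not permitted for this pair")
    if amount > traffic[src][dst]:
        raise ValueError("cannot reroute more traffic than the pair carries")
    result = [row[:] for row in traffic]
    result[src][dst] -= amount
    # Nonlocal effect: two other matrix entries grow by the rerouted amount.
    result[src][mid] += amount
    result[mid][dst] += amount
    return result


# Hypothetical 3-node example: reroute 2 of the 6 packets from node 0 to
# node 1 via node 2, the only permitted intermediate.
before = [[0, 6, 0],
          [0, 0, 0],
          [0, 0, 0]]
after = reroute(before, src=0, dst=1, amount=2, mid=2, allowed={2})
print(after)  # [[0, 4, 2], [0, 0, 0], [0, 2, 0]]
```

Rerouting 2 packets raises the total load from 6 to 8, which is the additional
traffic the paragraph warns may contribute to congestion or queuing delay.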

One primary limitation of the above approach is that the computation of the
neutral traffic matrix (reroute quantities) is not distributed. For any real
time system with reasonable expectations of fault-tolerance, we would prefer to
distribute computation of the traffic matrix. Ideally we should be able to make
all (reroute/padding) decisions locally.

The  feasibility  of  assigning  dynamic  priority  to  packets  and  its  effects  on  the 
scheduling  policy  must  be  studied  and  appropriate  techniques  to  guarantee  better 
quality  of  service  must  be  developed.  Further  research  is  also  required  in  developing 
optimal  scheduling  policies  for  specific  types  of  networks  and  validation  of  analytical 
results  by  simulation  studies. 

The network considered in this model was completely connected; other types of
networks could be considered as well. Also, it may be desirable to have
non-neutral apparent traffic matrices, particularly if the threat posed by the
eavesdropper is less ubiquitous than this model assumes. The linear programming
approach given will find rerouting solutions in this case as well.

8.2     Future  Research 

The first direction for future research concerns the nature of the apparent
traffic matrix presented by the model to the eavesdropper. The final traffic
matrix need not be constrained to be a neutral traffic matrix. It may suffice to
construct the final traffic matrix so that the intruder cannot estimate the real
traffic in the network even if he has eavesdropped on each link of the network
and has constructed the complete final apparent traffic matrix. Simulation of
the random routing strategy is a step in this direction, but a more formal
analysis of the routing strategy must be done. Also, issues such as the
"neutrality" of the traffic matrix obtained after random routing must be
quantified and measures must be taken to guarantee quality of service
considerations.

Secondly, though the performance analysis of the model used simulation
parameters from measurements done on UFNET, for simplicity we ignored several
important interactions of the model with the network protocols. Issues related
to interactions between the routing algorithm used by the network, congestion,
queuing delay and traffic analysis countermeasures are interesting and need
further study.

Any model seeking to address the traffic analysis problem must either be a part
of or interact with a Trusted Computing Base (TCB) and a Network Trusted
Computing Base (NTCB) environment via a secure channel. An implementation of the
model on an actual network will not only bring into focus the above mentioned
interactions, but will also be a useful exercise in studying the feasibility of
using this model to secure the traffic characteristics of a subset of nodes
within an existing network. Though we believe that the model is capable of
scaling in both directions, the important problems to be addressed in such a
scenario include the determination of reroute quantities in a distributed
manner, the modification to transmission schedules and the higher level security
considerations arising out of overlapping administrative domains.